Beyond Co-occurrence: Discovering and Visualizing Tag Relationships from Geo-spatial and Temporal Similarities

Haipeng Zhang, Mohammed Korayem, Erkang You, David Crandall

Studying relationships between keyword tags on social sharing websites has become a popular topic of research, both to improve tag suggestion systems and to automatically find connections between the concepts that tags represent. Existing approaches to discovering tag relationships rely mainly on tag co-occurrences, ignoring the other sources of information available on social sharing websites. In this paper, we show how to find similar tags by comparing their distributions over time and space, discovering tags with similar geographic and temporal patterns of use. In particular, we apply this technique to find related keyword tags in a dataset of tens of millions of geo-tagged, time-stamped photos downloaded from Flickr. Geo-spatial, temporal and geo-temporal distributions of tags are extracted and represented as vectors, which can then be compared and clustered. We show that we are able to successfully cluster Flickr photo tags based on their geographic and temporal patterns, and we evaluate the results both qualitatively and quantitatively using a panel of human judges. A case study suggests that our visualizations of temporal and geographical semantics help humans recognize subtle semantic relationships between tags. This approach to finding and visualizing similar tags is potentially useful for exploring and aggregating any data that has geographic and temporal annotations.
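To make the idea of representing a tag's geographic distribution as a vector more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the implementation from the paper: the 1-degree grid, the normalization to unit sum, the use of cosine similarity, and the names `geo_vector` and `cosine_similarity` are all hypothetical choices made for this example, and the coordinates are made-up toy data.

```python
import math
from collections import Counter


def geo_vector(points, bin_deg=1.0):
    """Bin (lat, lon) points into a grid of bin_deg-degree cells.

    Returns a sparse vector (dict: cell -> fraction of points in that cell).
    The grid size and the normalization are assumptions of this sketch.
    """
    counts = Counter(
        (math.floor(lat / bin_deg), math.floor(lon / bin_deg))
        for lat, lon in points
    )
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(cell, 0.0) for cell, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy, made-up photo locations for three tags (not real Flickr data).
photos = {
    "beach": [(25.8, -80.1), (25.7, -80.2), (33.9, -118.4)],
    "surf": [(25.9, -80.3), (33.7, -118.2), (33.8, -118.5)],
    "snow": [(46.5, 7.9), (46.8, 8.2), (39.6, -106.4)],
}

vectors = {tag: geo_vector(pts) for tag, pts in photos.items()}

# Tags used in the same grid cells score high; disjoint tags score zero.
beach_surf = cosine_similarity(vectors["beach"], vectors["surf"])
beach_snow = cosine_similarity(vectors["beach"], vectors["snow"])
print(f"beach vs surf: {beach_surf:.2f}")
print(f"beach vs snow: {beach_snow:.2f}")
```

A temporal vector could be built the same way by binning photo timestamps (for example by week of the year) instead of coordinates, and any standard clustering method could then be run on the resulting vectors. Those choices are likewise assumptions of this sketch.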

Results

  •  Geo Clusters, ranked by average second moment, with visualizations

some examples:


Papers and presentations

BibTeX entries:

@inproceedings{cooccur2012wsdm,
    author = {Haipeng Zhang and Mohammed Korayem and Erkang You and David Crandall},
    title = {Beyond Co-occurrence: Discovering and Visualizing Tag Relationships from Geo-spatial and Temporal Similarities},
    booktitle = {ACM International Conference on Web Search and Data Mining (WSDM)},
    year = {2012}
}

Acknowledgements:

We thank Prof. Andrew Hanson for discussions and advice on visualization, and the anonymous reviewers for their helpful comments. We also gratefully acknowledge the support of the following:

Lilly Endowment
IU Data to Insight Center
The IU Computer Vision Lab's projects and activities have been funded, in part, by grants and contracts from the Air Force Office of Scientific Research (AFOSR), the Defense Threat Reduction Agency (DTRA), Dzyne Technologies, EgoVid, Inc., ETRI, Facebook, Google, Grant Thornton LLP, IARPA, the Indiana Innovation Institute (IN3), the IU Data to Insight Center, the IU Office of the Vice Provost for Research through an Emerging Areas of Research grant, the IU Social Sciences Research Commons, the Lilly Endowment, NASA, National Science Foundation (IIS-1253549, CNS-1834899, CNS-1408730, BCS-1842817, CNS-1744748, IIS-1257141, IIS-1852294), NVidia, ObjectVideo, Office of Naval Research (ONR), Pixm, Inc., and the U.S. Navy. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government, or any sponsor.