ABSTRACT
Tagging has emerged as a popular means of annotating on-line objects such as bookmarks, photos and videos. Tags vary in semantic meaning and can describe different aspects of a media object: its content, as well as locations, dates, people and other associated meta-data. Automatically classifying tags into semantic categories allows us to better understand how users annotate media objects and to build tools for viewing and browsing those objects. In this paper we present a generic method for classifying tags using third-party open content resources, such as Wikipedia and the Open Directory. Our method uses structural patterns that can be extracted from resource meta-data. We describe the implementation of our method on Wikipedia, using WordNet categories as our classification schema and ground truth. Two structural patterns found in Wikipedia are used for training and classification: categories and templates. We apply our system to classifying Flickr tags. Compared to a WordNet baseline, our method increases the coverage of the Flickr vocabulary by 115%. We can classify many important entities that are not covered by WordNet, such as London Eye, Big Island, Ronaldinho, geocaching and wii.
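To make the idea of classifying from structural patterns concrete, the following is a minimal sketch, not the classifier used in the paper. It assumes each Wikipedia article is reduced to a set of category and template names, and that some articles already carry a WordNet-style label for training. The training data, the `cat:`/`tpl:` feature prefixes, the simple vote-counting classifier, and the function names are all hypothetical and chosen only for illustration. The last helper shows the arithmetic behind a relative coverage increase, assuming the 115% figure is relative to the baseline.

```python
from collections import Counter, defaultdict

# Hypothetical toy training set: article title -> (structural features, label).
# Features are Wikipedia category ("cat:") and template ("tpl:") names;
# labels stand in for WordNet categories. None of this is real data.
TRAINING = {
    "Paris": ({"cat:Capitals in Europe", "tpl:Infobox settlement"}, "location"),
    "Berlin": ({"cat:Capitals in Europe", "tpl:Infobox settlement"}, "location"),
    "Pele": ({"cat:Brazilian footballers", "tpl:Infobox football biography"}, "person"),
    "Zico": ({"cat:Brazilian footballers", "tpl:Infobox football biography"}, "person"),
}


def train(examples):
    """Count how often each structural feature co-occurs with each label."""
    counts = defaultdict(Counter)
    for features, label in examples.values():
        for feature in features:
            counts[feature][label] += 1
    return counts


def classify(features, counts):
    """Sum the label votes of all known features.

    Returns the label with the most votes, or None when no feature
    was seen during training (the tag stays unclassified).
    """
    votes = Counter()
    for feature in features:
        votes.update(counts.get(feature, {}))
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def relative_coverage_gain(baseline_covered, extended_covered):
    """Percentage increase in covered tags relative to the baseline."""
    return 100.0 * (extended_covered - baseline_covered) / baseline_covered


model = train(TRAINING)

# A tag that matches an article outside the labelled set, described only
# by its (hypothetical) category and template names.
ronaldinho = {"cat:Brazilian footballers", "tpl:Infobox football biography"}
print(classify(ronaldinho, model))

# Hypothetical counts: a baseline covering 100 tags and an extended
# method covering 215 corresponds to a 115% relative increase.
print(relative_coverage_gain(100, 215))
```

The vote-counting step is a stand-in for whatever learner is actually trained on these features. The point of the sketch is only that categories and templates give an unlabelled article a feature set that can be compared against labelled ones.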
Index Terms
- Classifying tags using open content resources