ABSTRACT
With the growth of tourism and Web 2.0 technologies, more and more people are willing to share their travel experiences on the Web (e.g., weblogs, forums, or Web 2.0 communities). These so-called travelogues contain rich information, in particular location-representative knowledge such as attractions (e.g., Golden Gate Bridge), styles (e.g., beach, history), and activities (e.g., diving, surfing). The location-representative information in travelogues can greatly facilitate other tourists' trip planning if it can be correctly extracted and summarized. However, since most travelogues are unstructured and contain much noise, it is difficult for common users to utilize such knowledge effectively. In this paper, to mine location-representative knowledge from a large collection of travelogues, we propose a probabilistic topic model, named the Location-Topic model. This model has two advantages: (1) it differentiates between two kinds of topics, i.e., local topics, which characterize locations, and global topics, which represent other common themes shared by various locations, and (2) it represents locations in the local topic space to encode both location-representative knowledge and similarities between locations. Several novel applications are developed based on the proposed model, including (1) destination recommendation for flexible queries, (2) characteristic summarization for a given destination with representative tags and snippets, and (3) identification of the informative parts of a travelogue and enrichment of such highlights with related images. Based on a large collection of travelogues, the proposed framework is evaluated using both objective and subjective evaluation methods and shows promising results.
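To make the idea of representing locations in a local topic space more concrete, the following is a minimal sketch, not the Location-Topic model itself. It assumes that each location has already been assigned a distribution over a few local topics; the location names, the weights, and the functions `similar_locations` and `recommend` are all hypothetical and chosen only for illustration. Cosine similarity is used here as one plausible way to compare locations; the abstract does not say which measure the paper uses.

```python
import math

# Hypothetical local topics (taken from the abstract's examples) and
# hypothetical per-location topic weights. These numbers are illustrative
# only and are not results from the paper.
LOCAL_TOPICS = ["beach", "history", "diving"]
LOCATION_TOPICS = {
    "Honolulu": [0.60, 0.10, 0.30],
    "Rome": [0.05, 0.90, 0.05],
    "Cancun": [0.55, 0.15, 0.30],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_locations(name, table=LOCATION_TOPICS):
    """Rank the other locations by similarity to `name` in local topic space."""
    query = table[name]
    scores = [(other, cosine(query, vec))
              for other, vec in table.items() if other != name]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def recommend(query_topics, table=LOCATION_TOPICS):
    """Rank locations by their average weight on the requested local topics."""
    idx = [LOCAL_TOPICS.index(topic) for topic in query_topics]
    scores = [(loc, sum(vec[i] for i in idx) / len(idx))
              for loc, vec in table.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(similar_locations("Honolulu"))
    print(recommend(["beach", "diving"]))
```

Under these assumptions, a flexible query such as "beach and diving" reduces to a lookup over topic weights, and location similarity reduces to a vector comparison, which is the intuition behind encoding locations in a shared local topic space.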
Index Terms
- Equip tourists with knowledge mined from travelogues