ABSTRACT
Recent developments in sensors, GPS, and smartphones have provided us with a large amount of mobility data. At the same time, large-scale crowd-generated social media data, such as geo-tagged tweets, provide rich semantic information about locations and events. Combining mobility data with the surrounding social media data enables us to understand semantically why a person travels to a location at a particular time (e.g., to attend a local event or visit a point of interest). Previous research on mobility data mining has focused mainly on mining patterns from the mobility data alone. In this paper, we study the problem of using social media to annotate mobility data. Because social media data is often noisy, the key research problem lies in choosing the right model to retrieve only the words that are relevant to a given mobility record. We propose three approaches to this problem: a frequency-based method, a Gaussian mixture model, and kernel density estimation (KDE). We show that KDE is the most suitable model because it captures the locality of word distributions very well. We test our proposal on a real dataset collected from Twitter and demonstrate the effectiveness of our techniques through both interesting case studies and a comprehensive evaluation.
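To make the KDE idea concrete, the following is a minimal sketch of how word locality might be scored at the location of a mobility record. It is not the paper's formulation: the function names (`gaussian_kernel`, `kde_word_scores`, `top_words`), the isotropic Gaussian kernel, the fixed bandwidth `h`, the planar treatment of coordinates, and the omission of the time dimension are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def gaussian_kernel(dx, dy, h):
    """2-D isotropic Gaussian kernel with bandwidth h (an assumed choice)."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * h * h)) / (2.0 * math.pi * h * h)


def kde_word_scores(tweets, query, h=0.01):
    """Estimate, for each word, the density of its occurrences at `query`.

    tweets: iterable of (x, y, words) tuples; coordinates are treated as planar.
    query:  (x, y) location of the mobility record.
    """
    locations = defaultdict(list)
    for x, y, words in tweets:
        for word in set(words):  # count each word once per tweet
            locations[word].append((x, y))

    qx, qy = query
    scores = {}
    for word, locs in locations.items():
        total = sum(gaussian_kernel(qx - x, qy - y, h) for x, y in locs)
        # Normalizing by the number of occurrences means a word that is
        # used everywhere gets a low density at any single location.
        scores[word] = total / len(locs)
    return scores


def top_words(tweets, query, k=3, h=0.01):
    """Return the k words with the highest estimated density at `query`."""
    scores = kde_word_scores(tweets, query, h)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: "concert" occurs only near the origin, while
# "lol" occurs near the origin and at many distant points.
near = [(0.001, 0.0), (0.0, 0.002), (-0.001, 0.001)]
far = [(float(i), float(-i)) for i in range(1, 10)]
tweets = [(x, y, ["concert", "lol"]) for x, y in near]
tweets += [(x, y, ["lol"]) for x, y in far]

print(top_words(tweets, (0.0, 0.0), k=2))
```

In this toy data both words occur equally often near the query point, so a plain local count cannot separate them, whereas the normalized density ranks the spatially concentrated word higher.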