ABSTRACT
Detecting local events (e.g., protests, disasters) at their onset is an important task for a wide spectrum of applications, ranging from disaster control to crime monitoring and place recommendation. Recent years have witnessed growing interest in leveraging geo-tagged tweet streams for online local event detection. Nevertheless, the accuracy of existing methods remains unsatisfactory for building reliable local event detection systems. We propose TrioVecEvent, a method that leverages multimodal embeddings to achieve accurate online local event detection. The effectiveness of TrioVecEvent is underpinned by its two-step detection scheme. First, it ensures high coverage of the underlying local events by dividing the tweets in the query window into coherent geo-topic clusters. To generate high-quality geo-topic clusters, we capture short-text semantics by learning multimodal embeddings of location, time, and text, and then perform online clustering with a novel Bayesian mixture model. Second, TrioVecEvent treats the geo-topic clusters as candidate events and extracts a set of features for classifying the candidates. Leveraging the multimodal embeddings as background knowledge, we introduce discriminative features that characterize local events well, which makes it possible to pinpoint true local events in the candidate pool with a small amount of training data. We have used crowdsourcing to evaluate TrioVecEvent and found that it outperforms the state-of-the-art method by a large margin.
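The two-step scheme described in the abstract can be illustrated with a minimal sketch. This is not the TrioVecEvent implementation: the Bayesian mixture model is replaced here by a simple greedy cosine-similarity clustering, the tweet embeddings are assumed to be given as plain vectors, and the names `cluster_tweets`, `candidate_features`, the `threshold` parameter, and the two features are hypothetical choices made for illustration only.

```python
import math
from dataclasses import dataclass, field


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(u, v):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(a * b for a, b in zip(u, v))


@dataclass
class Cluster:
    total: list                                   # running sum of member unit vectors
    members: list = field(default_factory=list)   # indices of member tweets

    def centroid(self):
        return normalize(self.total)


def cluster_tweets(embeddings, threshold=0.8):
    """Step 1 stand-in: greedy online clustering of tweet embeddings.

    Assumption: a tweet joins the most similar existing cluster if the
    cosine similarity reaches `threshold`; otherwise it opens a new one.
    The paper uses a Bayesian mixture model instead.
    """
    clusters = []
    for idx, emb in enumerate(embeddings):
        unit = normalize(emb)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(unit, c.centroid())
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append(Cluster(total=list(unit), members=[idx]))
        else:
            best.total = [a + b for a, b in zip(best.total, unit)]
            best.members.append(idx)
    return clusters


def candidate_features(cluster, embeddings):
    """Step 2 stand-in: hypothetical features for one candidate event."""
    centroid = cluster.centroid()
    sims = [cosine(normalize(embeddings[i]), centroid) for i in cluster.members]
    return {"size": len(cluster.members), "cohesion": sum(sims) / len(sims)}


# Toy input: three 2-d "tweet embeddings"; the first two point the same way.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
toy_clusters = cluster_tweets(toy_embeddings)
toy_features = [candidate_features(c, toy_embeddings) for c in toy_clusters]
```

In a full pipeline, feature dictionaries like these would be passed to a classifier trained on a small labeled set to separate true local events from other candidates; that classifier is omitted from the sketch.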
Index Terms
- TrioVecEvent: Embedding-Based Online Local Event Detection in Geo-Tagged Tweet Streams