ABSTRACT
Twitter is a pervasive technology, with hundreds of millions of users serving as sensors that provide eyewitness accounts of events on the ground. In the case of popular events, these sensors start to broadcast news by tweeting to their followers and to the world. Within minutes these tweets can attract attention and also serve as a primary information source for traditional media. Given a huge set of tweets, the key questions are: (1) How can we detect informative events in general? (2) How can we distinguish relevant events from others? In this paper we tackle these challenges with a statistical model that detects events by spotting significant deviations in word frequencies over time. Besides single-word events, our model also accounts for events composed of multiple co-occurring words, thus providing much richer information. Our statistical process is complemented by an optimization algorithm that extracts only non-redundant events, providing the user with a succinct summary of the current events. We used our model to analyze 24 million geotagged tweets sent in the US from April 9 to April 22, 2013 (the time period of the Boston Marathon bombing), and we show that our approach can create multi-word events that efficiently summarize real-world events.
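To make the idea of "significant deviations in word frequencies over time" concrete, the following is a minimal sketch and not the paper's actual model. It assumes a simple z-score test against a per-word baseline estimated from earlier time windows. The function names, the `threshold` and `min_std` parameters, and the whitespace tokenization are all illustrative assumptions.

```python
from collections import Counter
from math import sqrt


def window_frequencies(tweets):
    """Fraction of tweets in one time window that contain each word."""
    counts = Counter()
    for text in tweets:
        # Count a word once per tweet so repetition inside a tweet is ignored.
        counts.update(set(text.lower().split()))
    n = max(len(tweets), 1)
    return {word: c / n for word, c in counts.items()}


def deviating_words(history, current, threshold=3.0, min_std=0.01):
    """Flag words whose current frequency is far above their past mean.

    history: list of past windows, each a list of tweet strings.
    current: list of tweet strings for the window under test.
    Returns {word: z_score} for words with z_score > threshold.
    The z-score test is an assumed stand-in, not the paper's model.
    """
    if not history:
        raise ValueError("at least one past window is required")
    past = [window_frequencies(window) for window in history]
    now = window_frequencies(current)
    flagged = {}
    for word, freq in now.items():
        series = [p.get(word, 0.0) for p in past]
        mean = sum(series) / len(series)
        var = sum((x - mean) ** 2 for x in series) / len(series)
        # Floor the standard deviation so unseen words do not divide by zero.
        std = max(sqrt(var), min_std)
        z = (freq - mean) / std
        if z > threshold:
            flagged[word] = z
    return flagged
```

Calling `deviating_words(past_windows, current_window)` returns the words that burst in the current window. Grouping co-occurring flagged words into multi-word events and pruning redundant ones, as the abstract describes, would be separate steps that this sketch does not attempt.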
- Finding Non-Redundant Multi-Word Events on Twitter