Abstract
Social networks such as Twitter can quickly and broadly disseminate news and memes about both real-world events and cultural trends. Such networks are often the best sources of up-to-the-minute information and are therefore of considerable commercial and consumer interest. The trending topics that appear first on these networks answer the age-old question “what are people talking about?” Given the very high volume of posts (on the order of 45,000 or more per minute) and the vast number of stories about which users are posting at any given time, extracting trending stories in real time is a formidable problem. In this article, we describe a method and implementation for extracting trending topics from a high-velocity, real-time stream of microblog posts, and we present experimental results showing that our system can accurately find “hot” stories in high-rate, Twitter-scale text streams.
Index Terms
- Fast, Scalable, and Context-Sensitive Detection of Trending Topics in Microblog Post Streams