ABSTRACT
Retrospective news event detection (RED) is defined as the discovery of previously unidentified events in a historical news corpus. Although both the content and the time information of news articles are helpful for RED, most research focuses on exploiting article content, and little work has explored better uses of time information. In this paper, we explore both directions, based on two characteristics of news articles. On the one hand, news articles are always triggered by events; on the other hand, similar articles reporting the same event often appear redundantly across many news sources. The former suggests a generative model of news articles, and the latter provides a data-rich environment in which to perform RED. Taking these characteristics into account, we propose a probabilistic model that incorporates both content and time information in a unified framework. This model gives new representations of both news articles and news events. Furthermore, based on this approach, we build an interactive RED system, HISCOVERY, which provides two additional functions for presenting events: Photo Story and Chronicle.
Index Terms
- A probabilistic model for retrospective news event detection