ABSTRACT
This work tackles the following problem: given a present news event, generate a plausible future event that the given event may cause. We present a new methodology for modeling and predicting such future news events using machine learning and data mining techniques. Our Pundit algorithm generalizes examples of causality pairs to infer a causality predictor. To obtain precise labeled causality examples, we mine 150 years of news articles and apply semantic natural language modeling techniques to titles containing certain predefined causality patterns. For generalization, the model uses a large collection of world-knowledge ontologies mined from LinkedData, containing ~200 datasets with approximately 20 billion relations. Empirical evaluation on real news articles shows that our Pundit algorithm reaches human-level performance.
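To make the idea of mining titles for "predefined causality patterns" concrete, here is a minimal Python sketch. It is not the Pundit implementation: the connectives in `CAUSAL_PATTERNS` and the helper name `extract_causality_pair` are hypothetical, chosen only for illustration, and the abstract does not list the actual patterns. The sketch assumes simple surface matching with regular expressions rather than the semantic modeling the abstract mentions.

```python
import re
from typing import Optional, Tuple

# Hypothetical causality connectives (assumption: the real pattern list
# is not given in the abstract).
CAUSAL_PATTERNS = [
    # "<effect> because of / due to / after <cause>"
    re.compile(
        r"^(?P<effect>.+?)\s+(?:because of|due to|after)\s+(?P<cause>.+)$",
        re.IGNORECASE,
    ),
    # "<cause> causes / leads to / results in <effect>"
    re.compile(
        r"^(?P<cause>.+?)\s+(?:causes|caused|leads to|led to|results in|resulted in)"
        r"\s+(?P<effect>.+)$",
        re.IGNORECASE,
    ),
]


def extract_causality_pair(title: str) -> Optional[Tuple[str, str]]:
    """Return (cause, effect) if the title matches a pattern, else None."""
    text = title.strip().rstrip(".")
    for pattern in CAUSAL_PATTERNS:
        match = pattern.match(text)
        if match:
            return match.group("cause").strip(), match.group("effect").strip()
    return None


if __name__ == "__main__":
    for headline in [
        "Flights cancelled due to volcanic ash",
        "Earthquake leads to tsunami warning",
        "Markets open higher",
    ]:
        print(headline, "->", extract_causality_pair(headline))
```

Pairs extracted this way would serve only as raw training examples. The generalization step described in the abstract, which draws on ontologies to abstract over the entities and actions in each pair, is out of scope for this sketch.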
Index Terms
- Learning causality for news events prediction