ABSTRACT
Deriving pseudo causal relations from medical text lies at the heart of medical literature mining. Existing studies use extraction models to find pseudo causal relations within single sentences, but they do not consider the knowledge created by causation transitivity, which often spans multiple sentences. Furthermore, we observe that many pseudo causal relations follow the rule of causation transitivity, which makes it possible to discover unseen causal relations and to generate new causal relation hypotheses. In this paper, we address these two issues by proposing a factor graph model that incorporates three clues to discover causation expressions in text data. We propose four types of triad structures to represent the rules of causation transitivity among causal relations. Our proposed model, called CausalTriad, uses textual and structural knowledge to infer pseudo causal relations from the triad structures. Experimental results on two datasets demonstrate that (a) CausalTriad is effective for pseudo causal relation discovery within and across sentences; (b) CausalTriad is highly capable of recognizing implicit pseudo causal relations; (c) CausalTriad can infer missing or new pseudo causal relations from text data.
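As a rough illustration of the transitivity rule mentioned above (if A causes B and B causes C, then A causing C is a candidate hypothesis), the following minimal Python sketch enumerates such candidates from a set of already extracted pairs. It is not the CausalTriad factor graph model, which scores relations using textual and structural clues; the function name `transitive_hypotheses` and the toy relation pairs are hypothetical and chosen only for illustration.

```python
def transitive_hypotheses(relations):
    """Return (cause, effect) pairs implied by one step of transitivity.

    `relations` is an iterable of (cause, effect) pairs that are already
    known. Only pairs not present in the input are returned, so the result
    is a set of new candidate hypotheses, not confirmed relations.
    """
    known = set(relations)

    # Index the known effects of every cause for quick lookup.
    effects_of = {}
    for cause, effect in known:
        effects_of.setdefault(cause, set()).add(effect)

    hypotheses = set()
    for cause, mid in known:
        # Chain cause -> mid -> effect into the candidate cause -> effect.
        for effect in effects_of.get(mid, ()):
            if effect != cause and (cause, effect) not in known:
                hypotheses.add((cause, effect))
    return hypotheses


# Hypothetical toy relations, not taken from the paper's datasets.
relations = [
    ("smoking", "hypertension"),
    ("hypertension", "stroke"),
    ("smoking", "cough"),
]
candidates = transitive_hypotheses(relations)
print(candidates)
```

Each returned pair is only a hypothesis. In a setting like the one the abstract describes, such candidates would still need to be weighed against textual evidence before being accepted.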
Index Terms
- CausalTriad: Toward Pseudo Causal Relation Discovery and Hypotheses Generation from Medical Text Data