research-article

CausalTriad: Toward Pseudo Causal Relation Discovery and Hypotheses Generation from Medical Text Data

Authors:
Sendong Zhao (Harbin Institute of Technology, Harbin, China)
Meng Jiang (University of Notre Dame, South Bend, USA)
Ming Liu (Harbin Institute of Technology, Harbin, China)
Bing Qin (Harbin Institute of Technology, Harbin, China)
Ting Liu (Harbin Institute of Technology, Harbin, China)

BCB '18: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, August 2018, Pages 184–193. https://doi.org/10.1145/3233547.3233555

Published: 15 August 2018

ABSTRACT

Deriving pseudo causal relations from medical text data lies at the heart of medical literature mining. Existing studies have used extraction models to find pseudo causal relations within single sentences, while the knowledge created by causation transitivity, which often spans multiple sentences, has not been considered. Furthermore, we observe that many pseudo causal relations follow the rule of causation transitivity, which makes it possible to discover unseen causal relations and generate new causal relation hypotheses. In this paper, we address these two issues by proposing a factor graph model that incorporates three clues to discover causation expressions in text data. We propose four types of triad structures to represent the rules of causation transitivity among causal relations. Our proposed model, called CausalTriad, uses textual and structural knowledge to infer pseudo causal relations from the triad structures. Experimental results on two datasets demonstrate that (a) CausalTriad is effective for pseudo causal relation discovery within and across sentences; (b) CausalTriad is highly capable of recognizing implicit pseudo causal relations; and (c) CausalTriad can infer missing or new pseudo causal relations from text data.
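
The idea of causation transitivity mentioned in the abstract can be illustrated with a toy sketch. This is not the CausalTriad factor graph model, and it does not reproduce the paper's four triad structures; it only shows the simplest reading of the rule: if A causes B and B causes C are both known, then A causes C is a candidate hypothesis. The function name `transitive_hypotheses` and the placeholder labels are assumptions made for illustration.

```python
def transitive_hypotheses(known_pairs):
    """Propose (cause, effect) pairs implied by one step of transitivity.

    known_pairs: iterable of (cause, effect) tuples already extracted.
    Returns a sorted list of pairs (a, c) such that (a, b) and (b, c)
    are known for some b, and (a, c) itself is not yet known.
    """
    known = set(known_pairs)

    # Index the known effects of each cause for quick lookup.
    effects_of = {}
    for cause, effect in known:
        effects_of.setdefault(cause, set()).add(effect)

    hypotheses = set()
    for a, b in known:
        for c in effects_of.get(b, ()):
            # Skip self-loops and relations that are already known.
            if a != c and (a, c) not in known:
                hypotheses.add((a, c))
    return sorted(hypotheses)


if __name__ == "__main__":
    # Placeholder labels, not real medical findings.
    pairs = [("A", "B"), ("B", "C"), ("C", "D")]
    print(transitive_hypotheses(pairs))
```

A probabilistic model such as the one the abstract describes would score such candidates rather than accept them outright, since transitivity does not hold for every chain of pseudo causal relations.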

Index Terms

1. Applied computing
  1. Life and medical sciences
    1. Bioinformatics
    2. Health informatics
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction

Published in
BCB '18: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
August 2018
727 pages
ISBN: 9781450357944
DOI: 10.1145/3233547
General Chairs: Amarda Shehu (George Mason University, USA), Cathy Wu (University of Delaware, USA)
Program Chairs: Christina Boucher (University of Florida, USA), Jing Li (Case Western Reserve University, USA), Hongfang Liu (Mayo Clinic, USA), Mihai Pop (University of Maryland, USA)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
causation transitivity rules
factor graph
medical causal relation discovery
structural inference
Acceptance Rates
BCB '18 Paper Acceptance Rate: 46 of 148 submissions, 31%
Overall Acceptance Rate: 254 of 885 submissions, 29%