DOI: 10.1145/3535508.3545518
short-paper
Public Access

Attention-based aspect reasoning for knowledge base question answering on clinical notes

Published: 07 August 2022

ABSTRACT

Question answering (QA) on clinical notes has gained considerable attention in the past few years. Existing machine reading comprehension approaches in the clinical domain can only handle questions about a single block of clinical text and fail to retrieve information that spans multiple patients and their clinical notes. To handle more complex questions, we aim to create a knowledge base from clinical notes that links different patients and clinical notes, and to perform knowledge base question answering (KBQA). Based on the expert annotations available in the n2c2 dataset, we first created the ClinicalKBQA dataset, which includes around 9K QA pairs and covers questions about seven medical topics using more than 300 question templates. Then, we investigated an attention-based aspect reasoning (AAR) method for KBQA and analyzed the impact of different aspects of answers (e.g., entity, type, path, and context) on prediction. The AAR method achieves better performance due to its well-designed encoder and attention mechanism. From our experiments, we find that the type and path aspects enable the model to identify answers that satisfy general conditions, which yields lower precision and higher recall. In contrast, the entity and context aspects restrict the answers using node-specific information, which leads to higher precision and lower recall.
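To make the idea of weighting answer aspects concrete, the following is a minimal sketch and not the AAR model from the paper. It assumes each candidate answer node already has one embedding per aspect (entity, type, path, context) and that the question has an embedding of the same size; the function names `score_candidate` and `precision_recall`, the dot-product similarity, and the softmax weighting are all illustrative assumptions. The second helper shows how precision and recall are computed for set-valued answers, which is the trade-off the abstract describes.

```python
import math

# The four answer aspects named in the abstract.
ASPECTS = ("entity", "type", "path", "context")


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def score_candidate(question_vec, aspect_vecs):
    """Hypothetical scoring: attend over the aspects of one candidate.

    Similarity of the question to each aspect embedding is turned into
    attention weights, and the candidate score is the weighted sum of
    those similarities. Returns (score, {aspect: weight}).
    """
    sims = [dot(question_vec, aspect_vecs[a]) for a in ASPECTS]
    weights = softmax(sims)
    score = sum(w * s for w, s in zip(weights, sims))
    return score, dict(zip(ASPECTS, weights))


def precision_recall(predicted, gold):
    """Precision and recall for a predicted answer set against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy embeddings (made up for illustration only).
question = [1.0, 0.0]
candidate = {
    "entity": [0.9, 0.1],
    "type": [0.5, 0.5],
    "path": [0.4, 0.6],
    "context": [0.8, 0.2],
}
score, weights = score_candidate(question, candidate)

# A broad answer set (general conditions) versus a narrow one (node-specific).
broad = precision_recall({"n1", "n2", "n3", "n4"}, {"n1", "n2"})   # (0.5, 1.0)
narrow = precision_recall({"n1"}, {"n1", "n2"})                    # (1.0, 0.5)
```

In this toy setup, the broad answer set recovers every gold node at the cost of extra predictions, while the narrow set is exact but incomplete, mirroring the precision and recall pattern reported for the type/path and entity/context aspects.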

          • Published in

            BCB '22: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
            August 2022
            549 pages
            ISBN:9781450393867
            DOI:10.1145/3535508

            Copyright © 2022 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Acceptance Rates

            Overall Acceptance Rate: 254 of 885 submissions, 29%
