DOI: 10.5555/1572364.1572374
research-article

Disambiguation of biomedical abbreviations

Published: 04 June 2009

ABSTRACT

Abbreviations are common in biomedical documents and many are ambiguous in the sense that they have several potential expansions. Identifying the correct expansion is necessary for language understanding and important for applications such as document retrieval. Identifying the correct expansion can be viewed as a Word Sense Disambiguation (WSD) problem. A WSD system that uses a variety of knowledge sources, including two types of information specific to the biomedical domain, is also described. This system was tested on a corpus of ambiguous abbreviations, created by automatically identifying the correct expansion in Medline abstracts, and found to identify the correct expansion with up to 99% accuracy.
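
The abstract does not say how the correct expansion was identified automatically when building the corpus. A minimal sketch of one plausible approach is given here, assuming that an abstract defines the abbreviation inline as "expansion (ABBR)". The candidate list, the function name `label_instance`, and the example sentence are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical candidate expansions for one ambiguous abbreviation.
CANDIDATES = {
    "BSA": ["bovine serum albumin", "body surface area"],
}


def label_instance(text, abbreviation, expansions):
    """Return (expansion, masked_text) if text defines the abbreviation, else None.

    Looks for "<expansion> (<abbreviation>)". When a definition is found, it is
    replaced by the bare abbreviation so that the expansion itself no longer
    sits next to the abbreviation in the resulting training context.
    """
    for expansion in expansions:
        pattern = re.compile(
            re.escape(expansion) + r"\s*\(\s*" + re.escape(abbreviation) + r"\s*\)",
            re.IGNORECASE,
        )
        if pattern.search(text):
            return expansion, pattern.sub(abbreviation, text)
    return None


example = "Doses were scaled by body surface area (BSA) in all patients."
labelled = label_instance(example, "BSA", CANDIDATES["BSA"])
```

Each labelled instance pairs an expansion with a context in which only the abbreviation appears, which is the form of training data a supervised WSD system needs. Texts without an inline definition return `None` and would simply be skipped under this assumption.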


      • Published in

        BioNLP '09: Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
        June 2009
        214 pages
        ISBN:9781932432305
        • Conference Chairs:
        • Kevin Bretonnel Cohen,
        • Dina Demner-Fushman,
        • Sophia Ananiadou,
        • John Pestian,
        • Jun'ichi Tsujii,
        • Bonnie Webber

        Publisher

        Association for Computational Linguistics

        United States

        Acceptance Rates

        BioNLP '09 Paper Acceptance Rate: 12 of 29 submissions, 41%
        Overall Acceptance Rate: 33 of 92 submissions, 36%
