DOI: 10.5555/1572364.1572369

Learning the scope of hedge cues in biomedical texts

Published: 04 June 2009

ABSTRACT

Identifying hedged information in biomedical literature is an important subtask in information extraction because it would be misleading to extract speculative information as factual information. In this paper we present a machine learning system that finds the scope of hedge cues in biomedical texts. The system is based on a similar system that finds the scope of negation cues. We show that the same scope finding approach can be applied to both negation and hedging. To investigate the robustness of the approach, the system is tested on the three subcorpora of the BioScope corpus that represent different text types.
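
To make the task concrete, the sketch below frames scope finding as token labeling: given a tokenized sentence and the position of a hedge cue, some classifier marks the first and last tokens of the cue's scope, and the span between them is taken as the scope. This framing is an illustrative assumption, not a description of the system presented in the paper; the example sentence, the label names `F`, `L`, and `O`, and the function `scope_from_labels` are all hypothetical.

```python
from typing import List, Optional, Tuple


def scope_from_labels(labels: List[str]) -> Optional[Tuple[int, int]]:
    """Turn per-token labels into an inclusive (start, end) scope span.

    Assumed label scheme (hypothetical): "F" marks the first token of the
    scope, "L" marks the last token, and "O" marks every other token.
    Returns None when no F...L pair can be found.
    """
    try:
        start = labels.index("F")
    except ValueError:
        return None
    # Search right to left for an "L" that is not before the start token.
    for end in range(len(labels) - 1, start - 1, -1):
        if labels[end] == "L":
            return (start, end)
    return None


# Hypothetical example: "suggest" (index 2) is the hedge cue.
tokens = ["These", "results", "suggest", "that", "X", "activates", "Y", "."]
cue_index = 2
# Labels a classifier might predict for this cue (made up for illustration).
labels = ["O", "O", "F", "O", "O", "O", "L", "O"]

span = scope_from_labels(labels)
scope_tokens = tokens[span[0]: span[1] + 1] if span else []
print(" ".join(scope_tokens))
```

Under these assumptions, the statement inside the recovered span would be treated as speculative rather than factual by a downstream information extraction step.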

          Published in

            BioNLP '09: Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
            June 2009
            214 pages
            ISBN:9781932432305
            • Conference Chairs:
            • Kevin Bretonnel Cohen,
            • Dina Demner-Fushman,
            • Sophia Ananiadou,
            • John Pestian,
            • Jun'ichi Tsujii,
            • Bonnie Webber

            Publisher

            Association for Computational Linguistics

            United States

            Qualifiers

            • research-article

            Acceptance Rates

            BioNLP '09 Paper Acceptance Rate: 12 of 29 submissions, 41%
            Overall Acceptance Rate: 22 of 63 submissions, 35%
