Learning the scope of hedge cues in biomedical texts

Authors:
Roser Morante, University of Antwerp, Antwerpen, Belgium
Walter Daelemans, University of Antwerp, Antwerpen, Belgium

BioNLP '09: Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing, June 2009, pages 28–36

Published: 4 June 2009

ABSTRACT

Identifying hedged information in biomedical literature is an important subtask in information extraction because it would be misleading to extract speculative information as factual information. In this paper we present a machine learning system that finds the scope of hedge cues in biomedical texts. The system is based on a similar system that finds the scope of negation cues. We show that the same scope finding approach can be applied to both negation and hedging. To investigate the robustness of the approach, the system is tested on the three subcorpora of the BioScope corpus that represent different text types.
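
To make the task concrete, the following is a minimal sketch of one way a hedge cue and its scope could be represented as per-token labels. It is an illustration only, not the system described in the paper: the `HedgeAnnotation` class, the `CUE`/`IN`/`OUT` label names, and the example sentence are all hypothetical, and the sentence is invented rather than taken from the BioScope corpus.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HedgeAnnotation:
    """Hypothetical container: a hedge cue and its scope as token ranges.

    All ranges are half-open: start is inclusive, end is exclusive.
    """
    cue_start: int
    cue_end: int
    scope_start: int
    scope_end: int


def scope_labels(n_tokens: int, ann: HedgeAnnotation) -> List[str]:
    """Return one label per token: CUE, IN (inside the scope) or OUT."""
    labels = []
    for i in range(n_tokens):
        if ann.cue_start <= i < ann.cue_end:
            labels.append("CUE")
        elif ann.scope_start <= i < ann.scope_end:
            labels.append("IN")
        else:
            labels.append("OUT")
    return labels


# Invented example sentence (an assumption, not corpus data).
tokens = "The protein may bind DNA .".split()

# Assume "may" (token 2) is the cue and its scope covers "may bind DNA".
annotation = HedgeAnnotation(cue_start=2, cue_end=3, scope_start=2, scope_end=5)

for token, label in zip(tokens, scope_labels(len(tokens), annotation)):
    print(f"{token}\t{label}")
```

Under this representation, an information extraction component could treat any statement whose tokens are labelled `CUE` or `IN` as speculative rather than factual, which is the distinction the abstract motivates.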

Index Terms

1. Applied computing
  1. Life and medical sciences
    1. Consumer health
    2. Health informatics
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Learning settings

Published in

BioNLP '09: Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
June 2009, 214 pages
ISBN: 9781932432305

Conference Chairs:
Kevin Bretonnel Cohen, Center for Computational Pharmacology, University of Colorado School of Medicine and The MITRE Corporation
Dina Demner-Fushman, Lister Hill National Center for Biomedical Communications, US National Library of Medicine
Sophia Ananiadou, University of Manchester and UK National Centre for Text Mining
John Pestian, Computational Medicine Center, University of Cincinnati, Cincinnati Children's Hospital Medical Center
Jun'ichi Tsujii, University of Tokyo and UK National Centre for Text Mining
Bonnie Webber, University of Edinburgh

Publisher: Association for Computational Linguistics, United States

Qualifiers: research-article

Acceptance Rates

BioNLP '09 paper acceptance rate: 12 of 29 submissions (41%)
Overall acceptance rate: 22 of 63 submissions (35%)