DOI: 10.3115/1072228.1072338

Notions of correctness when evaluating protein name taggers

Published: 24 August 2002

ABSTRACT

This paper introduces four different notions of correctness to be used when measuring the performance of protein name taggers, each of which reflects certain characteristics of the tagger under evaluation. The discussion of these notions centers on the evaluation of two protein name taggers: Yapex, developed by the authors, and KeX, developed by Fukuda et al. (1998). To illustrate the differences between the ways of evaluating, both taggers are applied to a test corpus of 101 MEDLINE abstracts in which all occurrences of protein names have been marked up by domain experts.

References

  1. Amos Bairoch and Rolf Apweiler. 2000. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucl. Acids Res., 28:45–48.
  2. Christian Blaschke, Miguel A. Andrade, Christos Ouzounis, and Alfonso Valencia. 1999. Automatic extraction of biological information from scientific text: protein-protein interactions. In Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology (ISMB'99), pages 60–67, Heidelberg, Germany, August 6–10.
  3. Andrew Borthwick, John Sterling, Eugene Agichtein, and Ralph Grishman. 1998. NYU: Description of the MENE named entity system as used in MUC-7. In Proceedings of the Seventh Message Understanding Conference (MUC-7), Fairfax, VA, USA, April 29 - May 1.
  4. Nigel Collier, Hyun Seok Park, Norihiro Ogata, Yuka Tateishi, Chikashi Nobata, Tomoko Ohta, Tateshi Sekimizu, Hisao Imai, Katsutoshi Ibushi, and Jun-ichi Tsujii. 1999. The GENIA project: corpus-based knowledge acquisition and information extraction from genome research papers. In Proceedings of the European Association for Computational Linguistics (EACL) conference.
  5. Nigel Collier, Chikashi Nobata, and Jun-ichi Tsujii. 2000. Extracting the names of genes and gene products with a hidden Markov model. In Proceedings of the 18th International Conference on Computational Linguistics (COLING-2000), pages 201–207, August.
  6. Berry de Bruijn and Joel Martin. 2000. Protein name tagging. Presented as a poster at the Eighth International Conference on Intelligent Systems for Molecular Biology (ISMB'00).
  7. Ken-ichiro Fukuda, Tatsuhiko Tsunoda, Ayuchi Tamura, and Toshihisa Takagi. 1998. Toward information extraction: Identifying protein names from biological papers. In Proceedings of the Pacific Symposium on Biocomputing (PSB'98), pages 705–716, Maui, Hawaii, January 4–9.
  8. Kevin Humphreys, George Demetriou, and Robert Gaizauskas. 2000. Two applications of information extraction to biological science journal articles: Enzyme interactions and protein structures. In Proceedings of the 5th Pacific Symposium of Biocomputing, pages 72–80.
  9. Chikashi Nobata, Nigel Collier, and Jun-ichi Tsujii. 1999. Automatic term identification and classification in biology texts. In Proceedings of the Natural Language Pacific Rim Symposium (NLPRS'2000), pages 369–374, November.
  10. Pasi Tapanainen and Timo Järvinen. 1997. A non-projective dependency parser. In Proceedings of the 5th Conference on Applied Natural Language Processing, pages 64–71, Washington D.C., April. Association for Computational Linguistics.
  11. James Thomas, David Milward, Christos Ouzounis, Stephen Pulman, and Mark Carroll. 2000. Automatic extraction of protein interactions from scientific abstracts. In Proceedings of the Pacific Symposium on Biocomputing (PSB 2000), pages 538–549, Oahu, Hawaii, January 4–9.

Published in

COLING '02: Proceedings of the 19th international conference on Computational linguistics - Volume 1
August 2002, 1184 pages
Publisher: Association for Computational Linguistics, United States
Overall acceptance rate: 1,537 of 1,537 submissions (100%)
