Semi-automated named entity annotation

Authors:
Kuzman Ganchev, University of Pennsylvania, Philadelphia, PA
Fernando Pereira, University of Pennsylvania, Philadelphia, PA
Mark Mandel, University of Pennsylvania, Philadelphia, PA
Steven Carroll, Children's Hospital of Philadelphia, Philadelphia, PA
Peter White, Children's Hospital of Philadelphia, Philadelphia, PA

LAW '07: Proceedings of the Linguistic Annotation Workshop, June 2007, Pages 53–56

Published: 28 June 2007

ABSTRACT

We investigate a way to partially automate corpus annotation for named entity recognition by requiring only binary decisions from an annotator. Our approach is based on a linear sequence model trained with a k-best MIRA learning algorithm. We ask an annotator to decide whether each mention produced by a high-recall tagger is a true mention or a false positive. We conclude that our approach can reduce the effort of extending a seed training corpus by up to 58%.
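
The binary-decision step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Mention` type, the function names, the BIO output encoding, and the simulated annotator are all assumptions made for illustration, and the high-recall tagger itself (the linear sequence model trained with k-best MIRA) is replaced by a hard-coded candidate list.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Mention:
    """A candidate entity span over tokens[start:end] (end is exclusive)."""
    start: int
    end: int
    label: str


def review_candidates(
    tokens: Sequence[str],
    candidates: Sequence[Mention],
    accept: Callable[[Sequence[str], Mention], bool],
) -> List[Mention]:
    """Ask one yes/no question per candidate and keep the accepted ones."""
    return [m for m in candidates if accept(tokens, m)]


def to_bio(tokens: Sequence[str], mentions: Sequence[Mention]) -> List[str]:
    """Encode non-overlapping mentions as BIO tags (an assumed output format)."""
    tags = ["O"] * len(tokens)
    for m in mentions:
        tags[m.start] = f"B-{m.label}"
        for i in range(m.start + 1, m.end):
            tags[i] = f"I-{m.label}"
    return tags


# Hypothetical input: two candidates from an imagined high-recall tagger.
tokens = ["The", "BRCA1", "gene", "regulates", "the", "cell", "cycle"]
candidates = [Mention(1, 2, "GENE"), Mention(5, 7, "GENE")]

# Stand-in for the human annotator: accepts only spans in a known-correct set.
gold_spans = {(1, 2)}


def simulated_annotator(toks: Sequence[str], m: Mention) -> bool:
    return (m.start, m.end) in gold_spans


confirmed = review_candidates(tokens, candidates, simulated_annotator)
print(to_bio(tokens, confirmed))
```

In this sketch the annotator answers one question per candidate, so the annotation cost grows with the number of candidates the tagger proposes rather than with the number of tokens. A true mention that the tagger fails to propose can never be recovered by a yes/no decision, which is presumably why the abstract specifies a high-recall tagger.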


Published in
LAW '07: Proceedings of the Linguistic Annotation Workshop
June 2007
210 pages
Program Chairs:
Branimir Boguraev, IBM T. J. Watson Research Center
Nancy Ide, Vassar College
Adam Meyers, New York University
Shigeko Nariyama, University of Melbourne
Manfred Stede, University of Potsdam
Janyce Wiebe, University of Pittsburgh
Graham Wilcock, University of Helsinki
Publisher:
Association for Computational Linguistics, United States