research-article

Using citations to generate surveys of scientific paradigms

Authors:
Saif Mohammad, University of Maryland and Human Language Technology Center of Excellence
Bonnie Dorr, University of Maryland and Human Language Technology Center of Excellence
Melissa Egan, University of Maryland
Ahmed Hassan, University of Michigan
Pradeep Muthukrishnan, University of Michigan
Vahed Qazvinian, University of Michigan
Dragomir Radev, University of Michigan
David Zajic, University of Maryland and Center for Advanced Study of Language

NAACL '09: Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, May 2009, Pages 584–592

Published: 31 May 2009

ABSTRACT

The number of research publications in various disciplines is growing exponentially. Researchers and scientists increasingly find themselves having to understand large amounts of technical material quickly. In this paper, we present the first steps toward producing an automatically generated, readily consumable technical survey. Specifically, we explore the combination of citation information and summarization techniques. Although prior work (Teufel et al., 2006) argues that citation text is unsuitable for summarization, we show that, in the framework of multi-document survey creation, citation texts can play a crucial role.

References

Shannon Bradshaw. 2003. Reference directed indexing: Redeeming relevance for subject search in citation indexes. In Proceedings of the 7th European Conference on Research and Advanced Technology for Digital Libraries.
Jaime G. Carbonell and Jade Goldstein. 1998. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 335–336, Melbourne, Australia.
Aaron Elkiss, Siwei Shen, Anthony Fader, Güneş Erkan, David States, and Dragomir R. Radev. 2008a. Blind men and elephants: What do citation summaries tell us about a research article? Journal of the American Society for Information Science and Technology, 59(1):51–62.
Aaron Elkiss, Siwei Shen, Anthony Fader, Güneş Erkan, David States, and Dragomir R. Radev. 2008b. Blind men and elephants: What do citation summaries tell us about a research article? Journal of the American Society for Information Science and Technology, 59(1):51–62.
Güneş Erkan and Dragomir R. Radev. 2004. LexRank: Graph-based centrality as salience in text summarization. Journal of Artificial Intelligence Research.
Wesley Hildebrandt, Boris Katz, and Jimmy Lin. 2004. Overview of the TREC 2003 question-answering track. In Proceedings of the 2004 Human Language Technology Conference and the North American Chapter of the Association for Computational Linguistics Annual Meeting (HLT/NAACL 2004).
Mark Joseph and Dragomir Radev. 2007. Citation analysis, centrality, and the ACL Anthology. Technical Report CSE-TR-535-07, University of Michigan, Dept. of Electrical Engineering and Computer Science.
Daniel Jurafsky and James H. Martin. 2008. Speech and Language Processing: An Introduction to Natural Language Processing, Speech Recognition, and Computational Linguistics (2nd edition). Prentice-Hall.
Min-Yen Kan, Judith L. Klavans, and Kathleen R. McKeown. 2002. Using the annotated bibliography as a resource for indicative summarization. In Proceedings of LREC 2002, Las Palmas, Spain.
Dan Klein and Christopher D. Manning. 2003. Accurate unlexicalized parsing. In Proceedings of the 41st Meeting of ACL, pages 423–430.
Julian Kupiec, Jan Pedersen, and Francine Chen. 1995. A trainable document summarizer. In SIGIR '95, pages 68–73, New York, NY, USA. ACM.
Amy Langville and Carl Meyer. 2006. Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press.
Jimmy J. Lin and Dina Demner-Fushman. 2006. Methods for automatically evaluating answers to complex questions. Information Retrieval, 9(5):565–587.
Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Proceedings of the ACL Workshop on Text Summarization Branches Out.
Qiaozhu Mei and ChengXiang Zhai. 2008. Generating impact-based summaries for scientific literature. In Proceedings of ACL '08, pages 816–824.
Preslav I. Nakov, Ariel S. Schwartz, and Marti A. Hearst. 2004. Citances: Citation sentences for semantic analysis of bioscience text. In Workshop on Search and Discovery in Bioinformatics.
Hidetsugu Nanba and Manabu Okumura. 1999. Towards multi-paper summarization using reference information. In IJCAI 1999, pages 926–931.
Hidetsugu Nanba, Takeshi Abekawa, Manabu Okumura, and Suguru Saito. 2004a. Bilingual PRESRI: Integration of multiple research paper databases. In Proceedings of RIAO 2004, pages 195–211, Avignon, France.
Hidetsugu Nanba, Noriko Kando, and Manabu Okumura. 2004b. Classification of research papers using citation links and citation types: Towards automatic review article generation. In Proceedings of the 11th SIG Classification Research Workshop, pages 117–134, Chicago, USA.
Ani Nenkova and Rebecca Passonneau. 2004. Evaluating content selection in summarization: The pyramid method. In Proceedings of the HLT-NAACL Conference.
Mark E. J. Newman. 2001. The structure of scientific collaboration networks. PNAS, 98(2):404–409.
Vahed Qazvinian and Dragomir R. Radev. 2008. Scientific paper summarization using citation summary networks. In COLING 2008, Manchester, UK.
Advaith Siddharthan and Simone Teufel. 2007. Whose idea was this, and why does it matter? Attributing scientific work to citations. In Proceedings of NAACL/HLT-07.
Simone Teufel and Marc Moens. 2002. Summarizing scientific articles: Experiments with relevance and rhetorical status. Computational Linguistics, 28(4):409–445.
Simone Teufel, Advaith Siddharthan, and Dan Tidhar. 2006. Automatic classification of citation function. In Proceedings of EMNLP, pages 103–110, Australia.
Simone Teufel. 2005. Argumentative zoning for improved citation indexing. Computing Attitude and Affect in Text: Theory and Applications, pages 159–170.
Ellen M. Voorhees. 2003. Overview of the TREC 2003 question answering track. In Proceedings of the Twelfth Text Retrieval Conference (TREC 2003).
David M. Zajic, Bonnie J. Dorr, Jimmy Lin, and Richard Schwartz. 2007. Multi-candidate reduction: Sentence compression as a tool for document summarization tasks. Information Processing and Management (Special Issue on Summarization).

Index Terms

Using citations to generate surveys of scientific paradigms
1. Applied computing
  1. Arts and humanities
    1. Language translation
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources

Published in
NAACL '09: Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
May 2009
716 pages
ISBN: 9781932432411
General Chair:
Mari Ostendorf, University of Washington
Program Chairs:
Michael Collins, Massachusetts Institute of Technology
Shri Narayanan, University of Southern California
Douglas W. Oard, Microsoft Research
Publisher:
Association for Computational Linguistics, United States

Acceptance Rates
Overall Acceptance Rate: 21 of 29 submissions, 72%
