Is sentence compression an NLG task?

ABSTRACT
Data-driven approaches to sentence compression define the task as dropping any subset of words from the input sentence while retaining important information and grammaticality. We show that only 16% of the observed compressed sentences in the domain of subtitling can be accounted for in this way. We argue that part of this is due to evaluation issues and estimate that a deletion model is in fact compatible with approximately 55% of the observed data. We analyse the remaining problems and conclude that in those cases word order changes and paraphrasing are crucial, and argue for more elaborate sentence compression models which build on NLG work.
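The deletion-only view described above treats a compression as valid only if it can be obtained by dropping words from the input while preserving the remaining word order. A minimal sketch of how one might test whether an observed compressed sentence is compatible with such a model is an in-order subsequence check (assuming whitespace tokenization and case-sensitive matching, which are simplifications; the paper's own analysis works over aligned treebank tokens):

```python
def is_deletion_compression(source_tokens, compressed_tokens):
    """Return True iff compressed_tokens is an in-order subsequence
    of source_tokens, i.e. obtainable by word deletion alone."""
    it = iter(source_tokens)
    # `tok in it` advances the iterator, so matches must occur in order.
    return all(tok in it for tok in compressed_tokens)


# Deletion alone suffices here:
is_deletion_compression("the cat sat on the mat".split(),
                        "cat sat on mat".split())      # True

# Word-order change: not reachable by deletion:
is_deletion_compression("the cat sat on the mat".split(),
                        "on the mat the cat sat".split())  # False
```

Counting the proportion of gold compressions for which this check succeeds gives a rough estimate of how much of a corpus a pure deletion model can account for; compressions involving reordering or paraphrasing fail the test.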