DOI: 10.5555/1610195.1610199

Is sentence compression an NLG task?

Published: 30 March 2009

ABSTRACT

Data-driven approaches to sentence compression define the task as dropping any subset of words from the input sentence while retaining important information and grammaticality. We show that only 16% of the observed compressed sentences in the domain of subtitling can be accounted for in this way. We argue that part of this is due to evaluation issues and estimate that a deletion model is in fact compatible with approximately 55% of the observed data. We analyse the remaining problems and conclude that in those cases word order changes and paraphrasing are crucial, and argue for more elaborate sentence compression models which build on NLG work.
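The deletion-only model described above can be checked mechanically: a compressed sentence is consistent with pure word deletion exactly when its tokens form a subsequence of the source sentence's tokens. A minimal sketch of such a check (the function name and whitespace tokenisation are illustrative assumptions, not the paper's actual evaluation procedure):

```python
def is_deletion_compression(source: str, compressed: str) -> bool:
    """Return True if `compressed` can be obtained from `source`
    by deleting zero or more words, preserving word order."""
    targets = iter(compressed.split())
    tok = next(targets, None)
    for word in source.split():
        # Advance through the target tokens whenever the next one
        # appears, in order, in the source.
        if tok is not None and word == tok:
            tok = next(targets, None)
    # All target tokens consumed => valid deletion-only compression.
    return tok is None

# Deletion-only compression: consistent with the model.
print(is_deletion_compression(
    "the cat sat quietly on the mat",
    "the cat sat on the mat"))        # True

# Word-order changes break the deletion model.
print(is_deletion_compression(
    "the cat sat on the mat",
    "on the mat the cat sat"))        # False
```

Under this definition, compressions involving reordering or paraphrasing, which the abstract argues dominate the subtitling data, would fail the check.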


Published in

ENLG '09: Proceedings of the 12th European Workshop on Natural Language Generation
March 2009, 202 pages

Publisher: Association for Computational Linguistics, United States

Qualifiers: research-article

Acceptance rates: ENLG '09 paper acceptance rate: 14 of 37 submissions (38%). Overall acceptance rate: 33 of 78 submissions (42%).
