DOI: 10.5555/1708155.1708167

Reducing redundancy in multi-document summarization using lexical semantic similarity

Published: 6 August 2009

ABSTRACT

We present an automatic multi-document summarization system for Dutch based on the MEAD system. We focus on redundancy detection, an essential ingredient of multi-document summarization. We introduce a semantic overlap detection tool, which goes beyond simple string matching. Our results so far do not confirm our expectation that this tool would outperform the other tested methods.
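
To make the notion of redundancy detection by simple string matching concrete, the following is a minimal sketch of a word-overlap filter. It is an illustration only, not the system described in the abstract: the function names, the Jaccard measure, the threshold of 0.5, and the example sentences are all assumptions chosen for the example. Because it compares surface word forms, such a filter cannot recognise two sentences that say the same thing with different words, which is the gap a semantic overlap tool is meant to address.

```python
import re


def tokens(sentence):
    """Lowercased word tokens as a set (a crude stand-in for real tokenization)."""
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two sentences, in the range [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_non_redundant(ranked_sentences, threshold=0.5, max_sentences=3):
    """Greedily keep sentences, in rank order, that overlap with no kept sentence.

    `threshold` and `max_sentences` are hypothetical parameters for this sketch.
    """
    selected = []
    for sentence in ranked_sentences:
        if len(selected) >= max_sentences:
            break
        # Keep the candidate only if it is dissimilar to everything kept so far.
        if all(jaccard(sentence, kept) < threshold for kept in selected):
            selected.append(sentence)
    return selected


# Invented Dutch example sentences, assumed to be ranked by relevance already.
ranked = [
    "De minister kondigde nieuwe maatregelen aan.",
    "De minister kondigde gisteren nieuwe maatregelen aan.",
    "De oppositie reageerde kritisch op het plan.",
]
summary = select_non_redundant(ranked)
```

Here the second sentence differs from the first by a single word, so it is treated as redundant and skipped, while the third sentence is kept.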

Published in

UCNLG+Sum '09: Proceedings of the 2009 Workshop on Language Generation and Summarisation
August 2009, 108 pages
ISBN: 9781932432510
Publisher: Association for Computational Linguistics, United States
