ABSTRACT
We present an automatic multi-document summarization system for Dutch based on the MEAD system. We focus on redundancy detection, an essential ingredient of multi-document summarization. We introduce a semantic overlap detection tool, which goes beyond simple string matching. Our results so far do not confirm our expectation that this tool would outperform the other tested methods.
- Regina Barzilay and Kathleen R. McKeown. 2005. Sentence fusion for multidocument news summarization. Computational Linguistics, 31(3):297--328.
- Gosse Bouma, Gertjan van Noord, and Robert Malouf. 2001. Alpino: Wide-coverage computational analysis of Dutch. In Computational Linguistics in the Netherlands 2000, pages 45--59. Rodopi, Amsterdam, New York.
- Jaime Carbonell and Jade Goldstein. 1998. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proceedings of SIGIR 1998, pages 335--336, New York, NY, USA. ACM.
- H. T. Dang. 2006. Overview of DUC 2006. In Proceedings of the Document Understanding Workshop, pages 1--10, Brooklyn, USA.
- Harold W. Kuhn. 1955. The Hungarian Method for the assignment problem. Naval Research Logistics Quarterly, 2:83--97.
- C.-Y. Lin and E. H. Hovy. 2003. Automatic evaluation of summaries using n-gram co-occurrence statistics. In Proceedings of HLT-NAACL, pages 71--78, Edmonton, Canada.
- D. Lin. 1998. An information-theoretic definition of similarity. In Proceedings of the ICML, pages 296--304.
- Erwin Marsi and Emiel Krahmer. 2007. Annotating a parallel monolingual treebank with semantic similarity relations. In Proceedings of the 6th International Workshop on Treebanks and Linguistic Theories, pages 85--96, Bergen, Norway.
- Dragomir Radev et al. 2004. MEAD - a platform for multidocument multilingual text summarization. In Proceedings of LREC 2004, Lisbon, Portugal.
- Karen Spärck Jones. 1972. A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28(1):11--21.
- P. Vossen, I. Maks, R. Segers, and H. van der Vliet. 2008. Integrating lexical units, synsets and ontology in the Cornetto Database. In Proceedings of LREC 2008, Marrakech, Morocco.
Index Terms
- Reducing redundancy in multi-document summarization using lexical semantic similarity