Abstract
To generate a well-organized summary from multiple documents, the extracted sentences must be arranged in a proper order. This paper describes our Multi-Document Summarization (MDS) system for TSC-3, focusing on an approach to coherent sentence ordering. Chronological ordering, which is widely used by conventional summarization systems, has a drawback: it arranges sentences without considering the information that each sentence presupposes. We propose a method that improves chronological ordering by resolving the precedent information that each arranged sentence presupposes. We combine this refinement algorithm with topical segmentation and chronological ordering, and describe the experiments and metrics used to test its effectiveness in MDS tasks. Results demonstrate that the proposed method significantly improves chronological sentence ordering. At the end of the paper, we also report an outline and evaluation of the important sentence extraction and redundant clause elimination integrated in our MDS system.
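As a point of reference, the chronological baseline that the abstract sets out to improve can be sketched in a few lines of Python. This is a minimal illustration only, assuming each extracted sentence carries the publication date of its source article and its position within that article. The `Sentence` class, its fields, and `chronological_order` are hypothetical names, and the refinement step proposed in the paper is not shown.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Sentence:
    """Hypothetical container for one extracted sentence."""
    text: str
    published: date  # publication date of the source article (assumed available)
    position: int    # index of the sentence within its article (assumed available)


def chronological_order(sentences: List[Sentence]) -> List[Sentence]:
    """Baseline ordering: oldest article first, ties broken by in-article position.

    Nothing here looks at what a sentence presupposes, which is the
    limitation of plain chronological ordering noted in the abstract.
    """
    return sorted(sentences, key=lambda s: (s.published, s.position))


extracted = [
    Sentence("The company confirmed the recall.", date(2004, 3, 2), 0),
    Sentence("A defect was reported in the brakes.", date(2004, 3, 1), 1),
    Sentence("Regulators opened an inquiry.", date(2004, 3, 1), 4),
]
ordered = chronological_order(extracted)
```

Because the sort key uses only date and position, a sentence can be placed before the material it depends on whenever that material appears in a later article.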
Index Terms
- Improving chronological ordering of sentences extracted from multiple newspaper articles