Improving chronological ordering of sentences extracted from multiple newspaper articles

Published: 01 September 2005

Abstract

Generating a well-organized summary from multiple documents requires determining a proper arrangement of the extracted sentences. This paper describes our Multi-Document Summarization (MDS) system for TSC-3, focusing on an approach to coherent sentence ordering for MDS. Chronological ordering is widely used by conventional summarization systems, but it has a drawback: it arranges sentences without considering the information that each sentence presupposes. We propose a method that improves chronological ordering by resolving the precedent information of the sentences being arranged. Combining this refinement algorithm with topical segmentation and chronological ordering, we describe our experiments and the metrics used to test its effectiveness in MDS tasks. Results demonstrate that the proposed method significantly improves chronological sentence ordering. At the end of the paper, we also report an outline and evaluation of the important sentence extraction and redundant clause elimination components integrated in our MDS system.

