Improving chronological ordering of sentences extracted from multiple newspaper articles

Authors:
Naoaki Okazaki, The University of Tokyo, Tokyo, Japan
Yutaka Matsuo, Cyber Assist Research Center, AIST Tokyo Waterfront, Tokyo, Japan
Mitsuru Ishizuka, The University of Tokyo, Tokyo, Japan
ACM Transactions on Asian Language Information Processing, Volume 4, Issue 3, pp. 321–339. https://doi.org/10.1145/1111667.1111673

Published: 01 September 2005

Abstract

Generating a well-organized summary from multiple documents requires determining a proper arrangement of the extracted sentences. This paper describes our Multi-Document Summarization (MDS) system for TSC-3, focusing on an approach to coherent sentence ordering for MDS. Chronological ordering is widely used by conventional summarization systems, but it has a drawback: it arranges sentences without considering the information that each sentence presupposes. We propose a method that improves chronological ordering by resolving the precedent information of the sentences being arranged. We combine this refinement algorithm with topical segmentation and chronological ordering, and describe the experiments and metrics used to test its effectiveness on MDS tasks. Results demonstrate that the proposed method significantly improves chronological sentence ordering. At the end of the paper, we also outline and evaluate the important sentence extraction and redundant clause elimination components integrated in our MDS system.
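
As a rough illustration of the baseline the abstract refers to, the sketch below shows plain chronological ordering only. It assumes each extracted sentence carries the publication date of its source article and its position within that article; the `Sentence` type, its field names, and the sample sentences are hypothetical and not taken from the paper. It does not implement the refinement algorithm or the topical segmentation step.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sentence:
    text: str
    published: date  # publication date of the source article (assumed available)
    position: int    # index of the sentence within its article (assumed available)


def chronological_order(sentences):
    """Sort by publication date, then by position within the source article."""
    return sorted(sentences, key=lambda s: (s.published, s.position))


# Hypothetical extracted sentences from articles published on two days.
extracted = [
    Sentence("The company announced a recall.", date(2004, 3, 2), 0),
    Sentence("A defect was first reported.", date(2004, 3, 1), 1),
    Sentence("Regulators opened an inquiry.", date(2004, 3, 2), 3),
]
ordered = chronological_order(extracted)
```

Sorting on these two keys alone ignores whether a sentence depends on context that did not make it into the summary, which is the shortcoming the abstract describes.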

Index Terms

1. Applied computing
  1. Computers in other domains
    1. Publishing
  2. Document management and text processing
    1. Document preparation
      1. Document scripting languages
      2. Markup languages
2. Information systems
  1. World Wide Web
    1. Web data description languages
      1. Markup languages

Published in

ACM Transactions on Asian Language Information Processing Volume 4, Issue 3
September 2005
138 pages
ISSN: 1530-0226
EISSN: 1558-3430
DOI: 10.1145/1111667

Copyright © 2005 ACM
Publisher
Association for Computing Machinery
New York, NY, United States

Author Tags
Multi-document summarization
arrange
coherence
order
sentence ordering
Qualifiers
- article