
The combination of similarity measures for extractive summarization

Authors:
Hy Nguyen, University of Science, VNU-HCMC, Ho Chi Minh City, Vietnam
Tung Le, University of Science, VNU-HCMC, Ho Chi Minh City, Vietnam
Viet-Thang Luong, University of Science, VNU-HCMC, Ho Chi Minh City, Vietnam
Minh-Quoc Nghiem, University of Science, VNU-HCMC, Ho Chi Minh City, Vietnam
Dien Dinh, University of Science, VNU-HCMC, Ho Chi Minh City, Vietnam

SoICT '16: Proceedings of the 7th Symposium on Information and Communication Technology, December 2016, Pages 66–72. https://doi.org/10.1145/3011077.3011139

Published: 08 December 2016


ABSTRACT

The key task in extractive summarization is to determine the importance of each sentence in the input. Several recent studies have compared the similarity between sentences to assess their significance efficiently. Each comparison method has its own strengths and weaknesses. In this paper, we propose a combination of similarity measures for sentence comparison. Experiments on both English and Vietnamese datasets demonstrate the efficiency of the proposed approach. Our model outperforms recent work on English with a significant improvement (9.4 ROUGE-2 F1-score) and achieves a competitive result on Vietnamese.
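
The abstract does not say which similarity measures are combined or how. As a rough illustration only, the sketch below assumes two simple measures (Jaccard word overlap and term-frequency cosine) and a hypothetical weighted average, then ranks sentences by their total similarity to the others. All function names, the weights, and the ranking rule are assumptions made for this example, not the authors' method.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return sentence.lower().split()


def jaccard(a, b):
    """Word-overlap similarity: |A & B| / |A | B| over token sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    """Cosine similarity between raw term-frequency vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def combined_similarity(s1, s2, weights=(0.5, 0.5)):
    """Hypothetical combination: weighted average of the two measures."""
    a, b = tokenize(s1), tokenize(s2)
    return weights[0] * jaccard(a, b) + weights[1] * cosine(a, b)


def rank_sentences(sentences):
    """Return sentence indices, most central first.

    A sentence's score is its total combined similarity to every other
    sentence; ties are broken by original position.
    """
    scores = []
    for i, s in enumerate(sentences):
        total = sum(
            combined_similarity(s, t)
            for j, t in enumerate(sentences)
            if j != i
        )
        scores.append((total, i))
    return [i for total, i in sorted(scores, key=lambda p: (-p[0], p[1]))]
```

An extractive summary could then be formed by taking the first few indices returned by `rank_sentences`. A weighted average is only one option; a product of measures or a learned combination would fit the same interface.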


Index Terms

The combination of similarity measures for extractive summarization
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Natural language generation


Published in
SoICT '16: Proceedings of the 7th Symposium on Information and Communication Technology
December 2016
442 pages
ISBN: 9781450348157
DOI: 10.1145/3011077
General Chairs: Nguyen Manh Hung (NTT University, Vietnam), Huynh Quyet Thang (HUST, Vietnam)
Program Chairs: Luc De Raedt (KULeuven, Belgium), Yves Deville (UCLouvain, Belgium), Marc Bui (EPHE, France), Truong Thi Dieu Linh (HUST, Vietnam)
Publications Chairs: Dinh Viet Sang (HUST, Vietnam), Nguyen Hong Phuong (HUST, Vietnam), Nguyen Thi Oanh (HUST, Vietnam)
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
deep learning
extractive summarization
multi-document summarization
similarity measures
Qualifiers
- research-article
Acceptance Rates
SoICT '16 Paper Acceptance Rate: 58 of 132 submissions, 44%
Overall Acceptance Rate: 147 of 318 submissions, 46%