A Novel Biased Diversity Ranking Model for Query-Oriented Multi-Document Summarization


Abstract:

Query-oriented multi-document summarization (QMDS) attempts to generate a concise piece of text by extracting sentences from a target document collection, with the aim of not only conveying the key content of that corpus but also satisfying the information needs expressed by a given query. Owing to its great practical value, QMDS has been intensively studied in recent decades. Three properties are considered crucial for a good summary: relevance, prestige, and low redundancy (or so-called diversity). Unfortunately, most existing work either disregards diversity or handles it with non-optimized heuristics, usually based on greedy sentence selection. Inspired by the manifold-ranking process, which deals with query-biased prestige, and the DivRank algorithm, which captures query-independent diversity ranking, we propose in this paper a novel biased diversity ranking model, named ManifoldDivRank, for query-sensitive summarization tasks. The top-ranked sentences discovered by our algorithm not only enjoy high query-oriented prestige but, more importantly, are also dissimilar to each other. Experimental results on the DUC2005 and DUC2006 benchmark data sets demonstrate the effectiveness of our proposal.


Info:

Pages: 2811-2816

Online since: August 2013
