Recent automatic text summarization techniques: a survey

Artificial Intelligence Review

Abstract

Because information on almost every topic is abundantly available on the internet, condensing the important information into a summary benefits many users. Hence, there is growing interest among the research community in developing new approaches to automatic text summarization. An automatic text summarization system generates a summary, i.e. a short text that includes all the important information in the document. Since the advent of text summarization in the 1950s, researchers have been trying to improve summarization techniques so that machine-generated summaries match human-made ones. A summary can be generated through extractive as well as abstractive methods. Abstractive methods are highly complex, as they need extensive natural language processing. Therefore, the research community is focusing more on extractive summaries, trying to make them more coherent and meaningful. Over the last decade, several extractive approaches have been developed for automatic summary generation that implement a number of machine learning and optimization techniques. This paper presents a comprehensive survey of the extractive text summarization approaches developed in the last decade. Their needs are identified, and their advantages and disadvantages are listed in a comparative manner. A few abstractive and multilingual text summarization approaches are also covered. Summary evaluation is another challenging issue in this research field. Therefore, both intrinsic and extrinsic methods of summary evaluation are described in detail, along with text summarization evaluation conferences and workshops. Furthermore, evaluation results of extractive summarization approaches are presented on some shared DUC datasets. Finally, this paper concludes with a discussion of useful future directions that can help researchers identify areas where further research is needed.




Notes

  1. http://champollion.sourceforge.net/.

  2. http://www-nlpir.nist.gov/related_projects/tipster_summac/.

  3. http://research.nii.ac.jp/ntcir/outline/prop-en.html.

  4. http://www-nlpir.nist.gov/projects/duc/.

  5. http://www.isi.edu/licensed-sw/see/.

  6. http://www.nist.gov/tac/.

References

  • Abuobieda A, Salim N, Albaham AT, Osman AH, Kumar YJ (2012) Text summarization features selection method using pseudo genetic-based model. In: International conference on information retrieval knowledge management, pp 193–197

  • Aliguliyev RM (2009) A new sentence similarity measure and sentence based extractive technique for automatic text summarization. Expert Syst Appl 36(4):7764–7772

    Article  Google Scholar 

  • Alguliev RM, Aliguliyev RM, Isazade NR (2013) Multiple documents summarization based on evolutionary optimization algorithm. Expert Syst Appl 40:1675–1689. doi:10.1016/j.eswa.2012.09.014

    Article  Google Scholar 

  • Alguliev RM, Aliguliyev RM, Hajirahimova MS, Mehdiyev CA (2011) MCMR: maximum coverage and minimum redundant text summarization model. Expert Syst Appl 38:14514–14522. doi:10.1016/j.eswa.2011.05.033

    Article  Google Scholar 

  • Almeida M, Martins AF (2013) Fast and robust compressive summarization with dual decomposition and multi-task learning. In: ACL (1), pp 196–206

  • Amigó E, Gonzalo J, Penas A, Verdejo F (2005) QARLA: a framework for the evaluation of text summarization systems. In: ACL ’05: proceedings of the 43rd annual meeting on association for computational linguistics, pp 280–289

  • Amati G (2003) Probability models for information retrieval based on divergence from randomness. University of Glasgow

  • Amini MR, Usunier N (2009) Incorporating prior knowledge into a transductive ranking algorithm for multi-document summarization. In: Proceedings of the 32nd annual ACM SIGIR conference on research and development in information retrieval (SIGIR’09), pp 704–705

  • Antiqueira L, Oliveira ON, Costa F, Volpe G (2009) A complex network approach to text summarization. Inf Sci 179:584–599. doi:10.1016/j.ins.2008.10.032

    Article  MATH  Google Scholar 

  • Azmi AM, Al-Thanyyan S (2012) A text summarizer for Arabic. Comput Speech Lang 26:260–273. doi:10.1016/j.csl.2012.01.002

    Article  Google Scholar 

  • Bairi RB, Iyer R, Ramakrishnan G, Bilmes J (2015) Summarization of multi-document topic hierarchies using submodular. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing, pp 553–563

  • Banerjee S Mitra P, Sugiyama K (2015) Multi-document abstractive summarization using ILP based multi-sentence compression. In: Proceedings of the 24th international joint conference on artificial intelligence (IJCAI 2015), pp 1208–1214

  • Baralis E, Cagliero L, Jabeen S, Fiori A (2012) Multi-document summarization exploiting frequent itemsets. In: Symposium on applied computing (SAC’12), pp 782–786

  • Baralis E, Cagliero L, Mahoto N, Fiori A (2013) GRAPHSUM : discovering correlations among multiple terms for graph-based summarization. Inf Sci 249:96–109. doi:10.1016/j.ins.2013.06.046

    Article  MathSciNet  Google Scholar 

  • Barrera A, Verma R (2012) Combining syntax and semantics for automatic extractive single-document summarization. In: 13th international conference on computational linguistics and intelligent text processing. Springer, pp 366–377

  • Barzilay R, Lapata M (2005) Modeling local coherance: an entity-based approach. In: Proceedings of the 43rd annual meeting of the association for computational linguistics (ACL ’05), pp 141–148

  • Bing L, Li P, Liao Y, Lam W, Guo W, Passonneau RJ (2015) Abstractive multi-document summarization via phrase selection and. arXiv preprint arXiv:1506.01597

  • Boudin F, Morin E (2013) Keyphrase extraction for N-best reranking in multi-sentence compression. In: North American Chapter of the Association for Computational Linguistics (NAACL)

  • Brin S, Page L (1998) The anatomy of a large scale hypertextual web search engine. In: Proceedings of the 7th international conference on world wide web 7, pp 107–117

  • Cao Z, Wei F, Dong L, Li S, Zhou M (2015a) February. Ranking with recursive neural networks and its application to multi-document summarization. In: Twenty-ninth AAAI conference on artificial intelligence

  • Cao Z, Wei F, Dong L, Li S, Zhou M (2015b) Ranking with recursive neural networks and its application to multi-document summarization. In Twenty-ninth AAAI conference on artificial intelligence

  • Cao Z, Wei F, Li S, Li W, Zhou M, Wang H (2015c) Learning summary prior representation for extractive summarization. In: Proceedings of ACL: short papers, pp 829–833

  • Carbonell JG, Goldstein J (1998) The use of MMR, diversity-based re-ranking for re-ordering documents and producing summaries. In: Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval, pp 335–336

  • Carenini G, Ng RT, Zhou X (2007) Summarizing email conversations with clue words. In: Proceedings of the 16th international conference on World Wide Web. ACM. pp 91–100

  • Carenini G, Ng RT, Zhou X (2008) Summarizing emails with conversational cohesion and subjectivity. ACL 8:353–361

    Google Scholar 

  • Carlson L, Marcu D, Okurowski ME (2003) Building a discourse-tagged corpus in the framework of rhetorical structure theory. Springer, Netherlands, pp 85–112

    Google Scholar 

  • Chali Y, Hasan SA (2012) Query focused multi-document summarization: automatic data annotations and supervised learning approaches. Nat Lang Eng 18:109–145

    Article  Google Scholar 

  • Chan SWK (2006) Beyond keyword and cue-phrase matching: a sentence-based abstraction technique for information extraction. Decis Support Syst 42:759–777. doi:10.1016/j.dss.2004.11.017

    Article  Google Scholar 

  • Cilibrasi RL, Vitanyi PMB (2007) The Google similarity distance. IEEE Trans Knowl Data Eng 19:370–383

    Article  Google Scholar 

  • Deerwester S, Dumais ST, Furnas GW et al (1990) Indexing by latent semantic analysis. J Am Soc Inf Sci Technol 41:391–407

    Article  Google Scholar 

  • Dunlavy DM, O’Leary DP, Conroy JM, Schlesinger JD (2007) A system for querying, clustering and summarizing documents. Inf Process Manag 43:1588–1605

    Article  Google Scholar 

  • Erkan G, Radev D (2004) LexRank: graph-based lexical centrality as salience in text summarization. J Artif Intell Res 22:457–479

    Google Scholar 

  • Fang H, Lu W, Wu F et al (2015) Topic aspect-oriented summarization via group selection. Neurocomputing 149:1613–1619. doi:10.1016/j.neucom.2014.08.031

    Article  Google Scholar 

  • Fattah MA (2014) A hybrid machine learning model for multi-document summarization. 592–600. doi:10.1007/s10489-013-0490-0

  • Fattah MA, Ren F (2009) GA, MR, FFNN, PNN and GMM based models for automatic text summarization. Comput Speech Lang 23:126–144. doi:10.1016/j.csl.2008.04.002

    Article  Google Scholar 

  • Ferreira R, De Souza L, Dueire R et al (2013) Assessing sentence scoring techniques for extractive text summarization. Expert Syst Appl 40:5755–5764. doi:10.1016/j.eswa.2013.04.023

    Article  Google Scholar 

  • Ferreira R, de Souza Cabral L, Freitas F et al (2014) A multi-document summarization system based on statistics and linguistic treatment. Expert Syst Appl 41:5780–5787. doi:10.1016/j.eswa.2014.03.023

    Article  Google Scholar 

  • Filippova K (2010) August. Multi-sentence compression: finding shortest paths in word graphs. In: Proceedings of the 23rd international conference on computational linguistics. Association for computational linguistics, pp 322–330

  • Frank JR, Kleiman-Weiner M, Roberts DA, Niu F, Zhang C, Ré C, Soboroff I (2012) Building an entity-centric stream filtering test collection for TREC 2012. MASSACHUSETTS INST OF TECH CAMBRIDGE

  • Frey BJ, Dueck D (2007) Clustering by passing messages between data points. Science 315(5814):972–976

    Article  MathSciNet  MATH  Google Scholar 

  • Fung P, Ngai G (2006) One story, one flow: hidden Markov Story Models for multilingual multidocument summarization. ACM Trans Speech Lang 3:1–16. doi:10.1145/1149290.1151099

    Article  Google Scholar 

  • Ganesan K, Zhai C, Han J (2010) Opinosis : a graph-based approach to abstractive summarization of highly redundant opinions. In: Proceedings of the 23rd international conference on computational linguistics, pp 340–348

  • Genest PE, Lapalme G (2011) Framework for abstractive summarization using text-to-text generation. In: Proceedings of the workshop on monolingual text-to-text generation, Association for Computational Linguistics, pp 64–73

  • Giannakopoulos G, Karkaletsis V, Vouros G, Stamatopoulos P (2008) Summarization system evaluation revisited: N-gram graphs. ACM Trans Speech Lang Process 5:1–39

    Article  Google Scholar 

  • Gillick D, Favre B, Hakkani-Tur D, Bohnet B, Liu Y, Xie S (2009) The icsi/utd summarization system at tac 2009. In Proceedings of the text analysis conference workshop, Gaithersburg, MD (USA)

    Google Scholar 

  • Glavaš G, Šnajder J (2014) Event graphs for information retrieval and multi-document summarization. Expert Syst Appl 41:6904–6916. doi:10.1016/j.eswa.2014.04.004

    Article  Google Scholar 

  • Goldstein J, Mittal V, Carbonelll J, Kantrowitz M (2000) Multi-document summarization by sentence extraction. In: NAACL-ANLP 2000 workshop on automatic summarization. pp 40–48

  • Gong Y, Liu X (2001) Generic text summarization using relevance measure and latent semantic analysis. In: Proceedings of the 24st annual international ACM SIGIR conference on research and development in information retrieval. pp 19–25

  • Graff D, Kong J, Chen K, Maeda K (2003) English gigaword. Linguistic Data Consortium, Philadelphia

    Google Scholar 

  • Graham Y (2015) Re-evaluating automatic summarization with BLEU and 192 shades of ROUGE. In: Proceedings of the 2015 conference on empirical methods in natural language processing. pp 128–137

  • Grosz BJ, Weinstein S, Joshi AK (1995) Centering: a framework for modeling the local coherence of discourse. Comput Linguist 21:203–225

    Google Scholar 

  • Gupta V (2013) Hybrid algorithm for multilingual summarization of Hindi and Punjabi documents. In: Mining intelligence and knowledge exploration. Springer International Publishing, pp 717–727

  • Gupta V, Lehal GS (2010) A survey of text summarization extractive techniques. J Emerg Technol Web Intell 2:258–268. doi:10.4304/jetwi.2.3.258-268

    Google Scholar 

  • Gupta P, Pendluri VS, Vats I (2011) Summarizing text by ranking texts units according to shallow linguistic features. In: 13th international conference on advanced communication technology. pp 1620–1625

  • Haberlandt K, Bingham G (1978) Verbs contribute to the coherence of brief narratives: reading related and unrelated sentence triples. J Verbal Learn Verbal Behav 17:419–425

    Article  Google Scholar 

  • Hadi Y, Essannouni F, Thami ROH (2006) Unsupervised clustering by k-medoids for video summarization. In: ISCCSP’06 (the second international symposium on communications, control and signal processing)

  • Halliday MAK, Hasan R (1991) Language, context and text: aspects of language in a social-semiotic perspective. Oxford University Press, Oxford

    Google Scholar 

  • Harabagiu S, Lacatusu F (2005) Topic themes for multi-document summarization. In: SIGIR’ 05: proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval. pp 202–209

  • Harabagiu S, Lacatusu F (2010) Using topic themes for multi-document summarization. ACM Trans Inf Syst 28:13:1–13:47

  • He T, Shao W, Li F, Yang Z, Ma L (2008) The automated estimation of content-terms for query-focused multi-document summarization. In: Fuzzy systems and knowledge discovery, 2008. FSKD’08. Fifth international conference on IEEE, vol 5, pp 580–584

  • He Z, Chen C, Bu J, Wang C, Zhang L, Cai D, He X (2012) Document summarization based on data reconstruction. In: AAAI

  • Hearst M (1997) TextTiling: segmenting text into multi-paragraph subtopic passages. Comput Linguist 23:33–64

    Google Scholar 

  • Heu JU, Qasim I, Lee DH (2015) FoDoSu: multi-document summarization exploiting semantic analysis based on social Folksonomy. Inf Process Manag 51(1):212–225

    Article  Google Scholar 

  • Hirao T, Yoshida Y, Nishino M, Yasuda N, Nagata M (2013) Single-document summarization as a tree knapsack problem. EMNLP 13:1515–1520

    Google Scholar 

  • Hong K, Nenkova A (2014) Improving the estimation of word importance for news multi-document summarization. In: Proceedings of EACL

  • Hong K, Marcus M, Nenkova A (2015) System combination for multi-document summarization. In: Proceedings of the 2015 conference on empirical methods in natural language processing. pp 107–117

  • Hovy E, Lin CY, Zhou L, Fukumoto J (2006) Automated summarization evaluation with basic elements. In: Proceedings of the 5th international conference on language resources and evaluation (LREC), pp 81–94

  • Huang L, He Y, Wei F, Li W (2010) Modeling document summarization as multi-objective optimization. In: Proceedings of the third international symposium on intelligent information technology and security informatics, pp 382–386

  • Jones KS (2007) Automatic summarising: the state of the art. Inf Process Manag 43:1449–1481. doi:10.1016/j.ipm.2007.03.009

    Article  Google Scholar 

  • Kabadjov M, Atkinson M, Steinberger J et al. (2010) NewsGist: a multilingual statistical news summarizer. Lecture notes in computer science (including including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics) 6323 LNAI, pp 591–594. doi:10.1007/978-3-642-15939-8_40

  • Kaljahi R, Foster J, Roturier J (2014) Semantic role labelling with minimal resources: experiments with french. In: Lexical and computational semantics (*SEM 2014), p 87

  • Kallimani JS, Srinivasa KG, Eswara Reddy B (2011) Information extraction by an abstractive text summarization for an Indian regional language. In: Natural language processing and knowledge engineering (NLP-KE), 2011 7th international conference on IEEE, pp 319–322

  • Kedzie C, McKeown K, Diaz F (2015) Predicting salient updates for disaster summarization. In: Proceedings of the 53rd annual meeting of the ACL and the 7th international conference on natural language processing. pp 1608–1617

  • Khan A, Salim N, Jaya Kumar Y (2015) A framework for multi-document abstractive summarization based on semantic role labelling. Appl Soft Comput 30:737–747. doi:10.1016/j.asoc.2015.01.070

    Article  Google Scholar 

  • Kikuchi Y, Hirao T, Takamura H, Okumura M, Nagata M (2014) Single document summarization based on nested tree structure. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, vol 2, pp 315–320

  • Kim SM, Hovy E (2005) Automatic detection of opinion bearing words and sentences. In: Companion volume to the proceedings of the international joint conference on natural language processing (IJCNLP), pp 61–66

  • Kintsch W, Van Dijk TA (1978) Toward a model of text comprehension and production. Psychol Rev 85(5):363

  • Knuth DE (1977) A generalization of Dijkstra’s algorithm. Inf Process Lett 6:1–5

    Article  MathSciNet  MATH  Google Scholar 

  • Ko Y, Seo J (2004) Learning with unlabeled data for text categorization using a bootstrapping and a feature projection technique. In: Proceedings of the 42nd annual meeting of the association for computational linguistics (ACL 2004). pp 255–262

  • Ko Y, Seo J (2008) An effective sentence-extraction technique using contextual information and statistical approaches for text summarization. Pattern Recognit Lett 29:1366–1371. doi:10.1016/j.patrec.2008.02.008

    Article  Google Scholar 

  • Ko Y, Kim K, Seo J (2003) Topic keyword identification for text summarization using lexical clustering. IEICE Trans Inf Syst E86-D:1695–1701

  • Kruengkrai C, Jaruskulchai C (2003) Generic text summarization using local and global properties of sentences. In: Proceedings of the ieee/wic international conference on web intelligence (ieee/wic’03)

  • Kulesza A, Taskar B (2012) Determinantal point processes for machine learning. arXiv preprint arXiv:1207.6083

  • Kulkarni UV, Prasad RS (2010) Implementation and evaluation of evolutionary connectionist approaches to automated text summarization. J Comput Sci 6:1366–1376

    Article  Google Scholar 

  • Landauer TK, Foltz PW, Laham D (1998) An intoduction to latent semantic analysis. Discourse Process 25:259–284

    Article  Google Scholar 

  • Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401(6755):788–791

    Article  Google Scholar 

  • Lee J-H, Park S, Ahn C-M, Kim D (2009) Automatic generic document summarization based on non-negative matrix factorization. Inf Process Manag 45:20–34

    Article  Google Scholar 

  • Leite DS, Rino LHM (2006) Selecting a feature set to summarize texts in Brazilian Portuguese. Advances in artificial intelligence-IBERAMIA-SBIA 2006:462–471

    Google Scholar 

  • Li JW, Ng KW, Liu Y, Ong KL (2007) Enhancing the effectiveness of clustering with spectra analysis. IEEE Trans Knowl Data Eng 19:887–902

    Article  Google Scholar 

  • Li C, Liu F, Weng F, Liu Y (2013) Document summarization via guided sentence compression. In: EMNLP, pp 490–500

  • Li C, Liu Y, Zhao L (2015a) Using external resources and joint learning for bigram weighting in ilp-based multi-document summarization. In: Proceedings of NAACL-HLT, pp 778–787

  • Li P, Bing L, Lam W, Li H, Liao Y (2015b) Reader-aware multi-document summarization via sparse coding. arXiv preprint arXiv:1504.07324

  • Lin CY (2004) ROUGE: a package for automatic evaluation of summaries. In: Proceedings of ACL text summarization workshop, pp 74–81

  • Lin H, Bilmes J (2010) Multi-document summarization via budgeted maximization of submodular functions. In: Human language technologies: the 2010 annual conference of the North American chapter of the association for computational linguistics, Association for Computational Linguistics, pp 912–920

  • Lin CY, Hovy E (2000) The automated acquisition of topic signatures for text summarization. In: Proceedings of the 18th conference on computational linguistics, pp 495–501

  • Liu Y, Wang X, Zhang J, Xu H (2008) Personalized PageRank based multi-document summarization. In: Semantic computing and systems, 2008. WSCS’08. IEEE international workshop on IEEE, pp 169–173

  • Liu X, Webster JJ, Kit C (2009) An extractive text summarizer based on significant words. In: Proceedings of the 22nd international conference on computer processing of oriental languages, language technology for the knowledge-based economy, Springer, pp 168–178

  • Liu H, Yu H, Deng ZH (2015) Multi-document summarization based on two-level sparse representation model. In: Twenty-ninth AAAI conference on artificial intelligence

  • Lloret E, Palomar M (2009) A gradual combination of features for building automatic summarisation systems. Text, speech and dialogue. Springer, Berlin, pp 16–23

    Chapter  Google Scholar 

  • Lloret E, Palomar M (2011a) Analyzing the use of word graphs for abstractive text summarization. In: IMMM 2011, first international conference, pp 61–66

  • Lloret E, Palomar M (2011b) Text summarisation in progress: a literature review. Artif Intell Rev 37:1–41. doi:10.1007/s10462-011-9216-z

    Article  Google Scholar 

  • Lloret E, Palomar M (2013) Tackling redundancy in text summarization through different levels of language analysis. Comput Stand Interfaces 35:507–518. doi:10.1016/j.csi.2012.08.001

    Article  Google Scholar 

  • Lloret E, Romá-Ferri MT, Palomar M (2013) COMPENDIUM: a text summarization system for generating abstracts of research papers. Data Knowl Eng 88:164–175. doi:10.1016/j.datak.2013.08.005

  • Luhn H (1958) The automatic creation of literature abstracts. IBM J Res Dev 2:159–165

    Article  MathSciNet  Google Scholar 

  • Mani I, Maybury M (1999) Advances in automatic text summarization. MIT Press, Cambridge

    Google Scholar 

  • Manning CD, Raghavan P, Schtze H (2008) Introduction to information retrieval. Cambridge University Press, Cambridge

    Book  MATH  Google Scholar 

  • Mann W, Thompson S (1988) Rhetorical structure theory: toward a functional theory of text organization. Text 8:243–281

    Google Scholar 

  • Mendoza M, Bonilla S, Noguera C et al (2014) Extractive single-document summarization based on genetic operators and guided local search. Expert Syst Appl 41:4158–4169. doi:10.1016/j.eswa.2013.12.042

    Article  Google Scholar 

  • Mihalcea R, Tarau P (2004) TextRank: bringing order into texts. In: Conference on empirical methods in natural language processing. pp 404–411

  • Moawad IF, Aref M (2012) Semantic graph reduction approach for abstractive Text Summarization. In: Proceedings of ICCES 2012, 2012 International Conference on Computer Engineering and Systems, pp 132–138. doi:10.1109/ICCES.2012.6408498

  • Murdock VG (2006) Aspects of sentence retrieval. University of Massachusetts, Amherst

    Google Scholar 

  • Neto JL, Freitas AA, Kaestner CAA (2002) Automatic text summarization using a machine learning approach. In: Proceedings of the 16th brazilian symposium on artificial intelligence (sbia), 2507 of lnai. pp 205–215

  • Neto JL, Santos AD, Kaestner CAA, Freitas AA (2000) Document clustering and text summarization. In: Proceedings of the fourth international conference practical applications of knowledge discovery and data mining (padd-2000), pp 41–55

  • Nobata C, Satoshi S, Murata M, Uchimoto K, Utimaya M, Isahara H (2001) Sentence extraction system asssembling multiple evidence. In: Proceedings 2nd NTCIR workshop, pp 319–324

  • Orasan C (2009) Comparative evaluation of term-weighing methods for automatic summarization. J Quant Linguist 16:67–95

    Article  Google Scholar 

  • Otterbacher J, Erkan G, Radev DR (2009) Biased LexRank: passage retrieval using random walks with question-based priors. Inf Process Manag 45(1):42–54

    Article  Google Scholar 

  • Oufaida H, Philippe B, Omar Nouali (2015) Using distributed word representations and mRMR discriminant analysis for multilingual text summarization. In: Natural language processing and information systems. Springer International Publishing, pp 51–63

  • Ouyang Y, Li W, Li S, Lu Q (2011) Applying regression models to query-focused multi-document summarization. Inf Process Manag 47:227–237

    Article  Google Scholar 

  • Ouyang Y, Li W, Zhang R et al (2013) A progressive sentence selection strategy for document summarization. Inf Process Manag 49:213–221. doi:10.1016/j.ipm.2012.05.002

    Article  Google Scholar 

  • Owczarzak K (2009) DEPEVAL(summ): dependency-based evaluation for automatic summaries. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP. pp 190–198

  • Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr 2:1–135

    Article  Google Scholar 

  • Pardo TAS, Rino LHM, Nunes MGV (2003a) Neuralsumm: a connexionist approach to automatic text summarization. In: Proceedings of the fourth Brazilian meeting artificial intelligence (ENIA). pp 1–10

  • Pardo TAS, Rino LHM, Nunes MGV (2003b) Gistsumm: a summarization tool based on a new extractive method. In: Proceedings of the sixth workshop on computational processing of written and spoken portuguese (propor), 2721 of LNAI, pp 210–218

  • Parveen D, Strube M (2015) Integrating importance, non-redundancy and coherence in graph-based extractive summarization. In: Proceedings of the 24th international conference on artificial intelligence. AAAI Press. pp 1298–1304

  • Patel A, Siddiqui T, Tiwary US (2007) A language independent approach to multilingual text summarization. In: Large scale semantic access to content (text, image, video, and sound), pp 123–132

  • Pitler E, Nenkova A (2008) Revisiting readability. In: Proceedings of the 2008 conference on empirical methods in natural language processing. pp 186–195

  • Prasad RS, Uplavikar NM, Wakhare SS, Jain VY, Avinash T (2012) Feature based text summarization. In: International journal of advances in computing and information researches

  • Quirk R, Greenbaum S, Leech G (1985) A comprehensive grammar of the English language. Longman, London and New York

    Google Scholar 

  • Radev D, Tam D (2003) Summarization evaluation using relative utility. In: CIKM ’03: proceedings of the 12th international conference on information and knowledge management, pp 508–511

  • Radev DR, Fan W, Zhang Z, Arbor A (2001) WebInEssence: a personalized web-based multi-document summarization and recommendation system. In: NAACL 2001 workshop on automatic summarization, pp 79–88

  • Radev D, Allison T, Goldensohn B et al. (2004a) MEAD: a platform for multidocument multilingual text summarization. Proc Lr, 1–4

  • Radev DR, Jing HY, Stys M, Tam D (2004b) Centroid-based summarization of multiple documents. Inf Process Manag 40:919–938

    Article  MATH  Google Scholar 

  • Riedhammer K, Favre B, Hakkani-Tur D (2010) Long story short- global unsupervised models for keyphrase based meeting summarization. Speech Commun 52:801–815

    Article  Google Scholar 

  • Rino LHM, Modolo M (2004) Supor: an environment for as of texts in brazilianportuguese. In: Espana for natural language processsing (EsTAL). pp 419–430

  • Rotem N (2011) Open text summarizer (ots). Retrieved from http://libots.sourceforge.net/

  • Rush AM, Chopra S, Weston J (2015) A neural attention model for abstractive sentence summarization. arXiv preprint arXiv:1509.00685

  • Russell SJ, Norvig P (1995) Artificial intelligence: a modern approach. Prentice-Hall International Incorporated, Englewood Cliffs

    MATH  Google Scholar 

  • Sanderson M, Croft WB (1999) Deriving concept hierarchies from text. Proceedings of SIGIR 1999:206–213

    Google Scholar 

  • Sarkar K (2010) Syntactic trimming of extracted sentences for improving extractive multi-document summarization. J Comput 2:177–184

    Google Scholar 

  • Shen C, Li T, Ding CH (2011) Integrating clustering and multi-document summarization by bi-mixture probabilistic latent semantic analysis (PLSA) with sentence bases. In: AAAI

  • Shen D, Sun J-T, Li H et al. (2007) Document summarization using conditional random fields. In: Proceedings of 20th international joint conference on artificial intelligence. pp 2862–2867

  • Simon I, Snavely N, Seitz SM (2007) Scene summarization for online image collections. In: Computer vision, 2007. ICCV 2007. IEEE 11th international conference on. IEEE. pp 1–8

  • Sipos R, Shivaswamy P, Joachims T (2012) Large-margin learning of submodular summarization models. In: Proceedings of the 13th conference of the European chapter of the association for computational linguistics, Association for Computational Linguistics, pp 224–233

  • Song W, Choi LC, Park SC, Ding XF (2011) Fuzzy evolutionary optimization modeling and its applications to unsupervised categorization and extractive summarization. Expert Syst Appl 38:9112–9121

    Article  Google Scholar 

  • Storn R, Price K (1997) Differential evolution-a simple and efficient heuristic for global optimization over continuous spaces. J Glob Optim 11(4):341–359

    Article  MathSciNet  MATH  Google Scholar 

  • Svore K, Vanderwende L, Burges C (2007) Enhancing single-document summarization by combining RankNet and third priority sources. In: Proceedings of the empirical methods on natural language processing and computational natural language learning (EMNLP-CoNLL), pp 448–457

  • Takamura H, Okumura M (2009) Text summarization model based on maximum coverage problem and its variant. In: Proceedings of the 12th conference of the European chapter of the association for computational linguistics, Association for Computational Linguistics, pp 781–789

  • Tan PN, Kumar V, Srivastava J (2002) Selecting the right interestingness measure for association patterns. In: ACM SIGKDD international conference on knowledge discovery and data mining (KDD’02). pp 32–41

  • Tang J, Yao L, Chen D (2009) Multi-topic based query-oriented summarization. SDM 9:1147–1158

    Google Scholar 

  • Tao Y, Zhou S, Lam W, Guan J (2008) Towards more text summarization based on textual association networks. In: Proceedings of the 2008 fourth international conference on semantics, knowledge and grid, pp 235–240

  • Teufel S, Halteren H (2004) Evaluating information content by factoid analysis: human annotation and stability. In: Proceedings of the 2004 conference on empirical methods in natural language processing, pp 419–426

  • Texlexan (2011) Texlexan: an open-source text summarizer. http://texlexan.sourceforge.net/

  • Tonelli S, Pianta E (2011) Matching documents and summaries using key concepts. In: Proceedings of the French text mining evaluation workshop

  • Tzouridis E, Nasir JA, Lahore LUMS, Brefeld U (2014) Learning to summarise related sentences. In: The 25th international conference on computational linguistics (COLING’14), Dublin, Ireland, ACL

  • Vadlapudi R, Katragadda R (2010) An automated evaluation of readability of summaries: capturing grammaticality, focus, structure and coherence. In: Proceedings of the NAACL HLT 2010 student research workshop. pp 7–12

  • van der Plas L, Henderson J, Merlo P (2010) D6.2: semantic role annotation of a French-English corpus. Computational Learning in Adaptive Systems for Spoken Conversation (CLASSiC)

  • Van der Plas L, Merlo P, Henderson J (2011) Scaling up automatic cross-lingual semantic role annotation. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies: short papers, vol 2. Association for computational linguistics, pp 299–304

  • Wan X (2008) Using only cross-document relationships for both generic and topic-focused multi-document summarizations. Inf Retr 11(1):25–49


  • Wan X (2010) Towards a unified approach to simultaneous single-document and multi-document summarizations. In: Proceedings of the 23rd international conference on computational linguistics (Coling 2010), pp 1137–1145

  • Wan X, Yang J (2008) Multi-document summarization using cluster-based link analysis. In: Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval. ACM. pp 299–306

  • Wan X, Xiao J (2009) Graph-based multi-modality learning for topic-focused multi-document summarization. In: IJCAI, pp 1586–1591

  • Wang D, Li T (2012) Weighted consensus multi-document summarization. Inf Process Manag 48:513–523


  • Wang C, Long L, Li L (2008a) HowNet based evaluation for Chinese text summarization. In: Proceedings of the international conference on natural language processing and software engineering. pp 82–87

  • Wang D, Li T, Zhu S, Ding C (2008b) Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization. In: Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval, pp 307–314

  • Wang D, Li T, Zhu S, Ding C (2009) Multi-document summarization using sentence-based topic models. In: Proceedings of the ACL-IJCNLP 2009 conference short papers, pp 297–300

  • Wang D, Li T, Ding C (2010) Weighted feature subset non-negative matrix factorization and its applications to document understanding. In: Proceedings of the 2010 IEEE international conference on data mining, pp 541–550

  • Wang D, Zhu S, Li T et al. (2011) Integrating document clustering and multi-document summarization. ACM Trans Knowl Discov Data 5:14:1–14:26

  • Wasson M (1998) Using leading text for news summaries: evaluation results and implications for commercial summarization applications. In: Proceedings of the 17th international conference on computational linguistics, vol 2. Association for computational linguistics, pp 1364–1368

  • Wei F, Li W, Lu Q, He Y (2008) Query sensitive mutual reinforcement chain and its application in query-oriented multi-document summarization. In: Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval (SIGIR’08), pp 283–290

  • Wei F, Li W, Lu Q, He Y (2010) A document-sensitive graph model for multi-document summarization. Knowl Inf Syst 22(2):245–259


  • Wenjie L, Furu W, Qin L, Yanxiang H (2008) PNR2: ranking sentences with positive and negative reinforcement for query-oriented update summarization. In: Proceedings of the 22nd international conference on computational linguistics (COLING’08), pp 489–496

  • Wilson T, Hoffmann P, Somasundaran S, Kessler J, Wiebe J, Choi Y, Cardie C, Riloff E, Patwardhan S (2005) OpinionFinder: a system for subjectivity analysis. In: Proceedings of HLT/EMNLP on interactive demonstrations. Association for Computational Linguistics, pp 34–35

  • Yang CC, Wang FL (2008) Hierarchical summarization of large documents. J Am Soc Inf Sci Technol 59:887–902


  • Yang C, Shen J, Peng J, Fan J (2013) Image collection summarization via dictionary learning for sparse representation. Pattern Recognit 46(3):948–961


  • Yang L, Cai X, Zhang Y, Shi P (2014) Enhancing sentence-level clustering with ranking-based clustering framework for theme-based summarization. Inf Sci 260:37–50. doi:10.1016/j.ins.2013.11.026


  • Yao JG, Wan X, Xiao J (2015a) Compressive document summarization via sparse optimization. In: Proceedings of the 24th international conference on artificial intelligence. AAAI Press. pp 1376–1382

  • Yao JG, Wan X, Xiao J (2015b) Phrase-based compressive cross-language summarization. In: Proceedings of the 2015 conference on empirical methods in natural language processing, pp 118–127

  • Ye S, Chua TS, Kan MY, Qiu L (2007) Document concept lattice for text understanding and summarization. Inf Process Manag 43:1643–1662. doi:10.1016/j.ipm.2007.03.010


  • Yeh J-Y, Ke H-R, Yang W-P, Meng I-H (2005) Text summarization using a trainable summarizer and latent semantic analysis. Inf Process Manag 41:75–95. doi:10.1016/j.ipm.2004.04.003


  • Yen JY (1971) Finding the k shortest loopless paths in a network. Manag Sci 17(11):712–716


  • Zajic DM, Dorr BJ, Lin J (2008) Single-document and multi-document summarization techniques for e-mail threads using sentence compression. Inf Process Manag 44:1600–1610


  • Zha H (2002) Generic summarization and keyphrase extraction using mutual reinforcement principle and sentence clustering. In: Proceedings of the 25th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR’02), pp 113–120

  • Zhang J, Xu H, Cheng X (2008a) GSPSummary: a graph-based sub-topic partition algorithm for summarization. In: Proceedings of the 2008 Asia information retrieval symposium, pp 321–334

  • Zhang J, Cheng X, Wu G, Xu H (2008b) AdaSum: an adaptive model for summarization. In: Proceedings of the 17th ACM conference on information and knowledge management (CIKM’08), pp 901–909

  • Zhao L, Wu L, Huang X (2009) Using query expansion in graph-based approach for query-focused multi-document summarization. Inf Process Manag 45(1):35–41


  • Zhou L, Lin CY, Munteanu DS, Hovy E (2006) ParaEval: using paraphrases to evaluate summaries automatically. In: Proceedings of the human language technology/North American association of computational linguistics conference, pp 447–454



Corresponding author

Correspondence to Vishal Gupta.


Cite this article

Gambhir, M., Gupta, V. Recent automatic text summarization techniques: a survey. Artif Intell Rev 47, 1–66 (2017). https://doi.org/10.1007/s10462-016-9475-9


  • DOI: https://doi.org/10.1007/s10462-016-9475-9
