Recent advances in deep learning based dialogue systems: a systematic survey

Artificial Intelligence Review

Abstract

Dialogue systems are a popular natural language processing (NLP) task because they are promising for real-life applications. They are also complicated, since they involve many NLP tasks that deserve study in their own right. As a result, a multitude of novel works on this task have been carried out, and most of them are deep learning based because of the outstanding performance of such models. In this survey, we focus mainly on deep learning based dialogue systems. We comprehensively review state-of-the-art research outcomes in dialogue systems and analyze them from two angles: model type and system type. Specifically, from the angle of model type, we discuss the principles, characteristics, and applications of the different models that are widely used in dialogue systems. This will help researchers become acquainted with these models and see how they are applied in state-of-the-art frameworks, which is helpful when designing a new dialogue system. From the angle of system type, we discuss task-oriented and open-domain dialogue systems as two streams of research, providing insight into the related hot topics. Furthermore, we comprehensively review the evaluation methods and datasets for dialogue systems to pave the way for future research. Finally, some possible research trends are identified based on recent research outcomes. To the best of our knowledge, this survey is the most comprehensive and up-to-date one at present for deep learning based dialogue systems, extensively covering the popular techniques. We believe that this work is a good starting point for academics who are new to dialogue systems or who want to quickly grasp up-to-date techniques in this area.

Notes

  1. Statistic source: https://markets.businessinsider.com.

  2. Statistic source: https://outgrow.co.

  3. The quality of being logical and consistent not only between words/subwords but also between responses of different timesteps.

  4. Template filling is an efficient approach for extracting and structuring complex information from text in order to fill in a pre-defined template. It is mostly used in task-oriented dialogue systems.

  5. https://openai.com/blog/better-language-models/.

  6. Stochastic gradient ascent simply uses the negated objective function of stochastic gradient descent.
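
The relationship in note 6 can be illustrated with a minimal sketch. The toy objective J(theta) = -(theta - 3)^2, the learning rate, and all function names below are assumptions made for illustration and do not come from the survey. For simplicity the sketch uses the exact gradient rather than a stochastic estimate. The sign relationship is the same in the stochastic case.

```python
def grad_J(theta):
    # Gradient of the assumed toy objective J(theta) = -(theta - 3)^2,
    # which is maximized at theta = 3.
    return -2.0 * (theta - 3.0)


def ascent_step(theta, lr):
    # Gradient ascent: move along the gradient of J.
    return theta + lr * grad_J(theta)


def descent_step_on_negated(theta, lr):
    # Gradient descent on the negated objective -J:
    # move against the gradient of -J.
    grad_neg_J = -grad_J(theta)
    return theta - lr * grad_neg_J


theta_a = theta_d = 0.0
for _ in range(200):
    theta_a = ascent_step(theta_a, lr=0.1)
    theta_d = descent_step_on_negated(theta_d, lr=0.1)
```

Both update rules produce the same sequence of parameters, so an optimizer written for minimization can maximize an objective simply by being given its negation.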

References

  • Abro WA, Qi G, Ali Z, Feng Y, Aamir M (2020) Multi-turn intent determination and slot filling with neural networks and regular expressions. Knowl-Based Syst 208:106428

    Article  Google Scholar 

  • Abro WA, Aicher A, Rach N, Ultes S, Minker W, Qi G (2022) Natural language understanding for argumentative dialogue systems in the opinion building domain. Knowl-Based Syst 242:108318

    Article  Google Scholar 

  • Agarwal S, Bui T, Lee JY, Konstas I, Rieser V (2020) History for visual dialog: Do we really need it? In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, pp 8182–8197, https://doi.org/10.18653/v1/2020.acl-main.728

  • Aghajanyan A, Maillard J, Shrivastava A, Diedrick K, Haeger M, Li H, Mehdad Y, Stoyanov V, Kumar A, Lewis M, Gupta S (2020) Conversational semantic parsing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online, pp 5026–5035, https://doi.org/10.18653/v1/2020.emnlp-main.408

  • Akama R, Yokoi S, Suzuki J, Inui K (2020) Filtering noisy dialogue corpora by connectivity and content relatedness. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 941–958, https://doi.org/10.18653/v1/2020.emnlp-main.68

  • Alberti C, Ling J, Collins M, Reitter D (2019) Fusion of detected objects in text for visual question answering. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), association for computational linguistics, Hong Kong, China, pp 2131–2140, https://doi.org/10.18653/v1/D19-1219

  • Aloysius N, Geetha M (2017) A review on deep convolutional neural networks. In: 2017 international conference on communication and signal processing (ICCSP), IEEE, pp 0588–0592

  • Arora S, Batra K, Singh S (2013) Dialogue system: a brief review. arXiv:1306.4134

  • Asghar N, Poupart P, Jiang X, Li H (2017) Deep active learning for dialogue generation. In: Proceedings of the 6th joint conference on lexical and computational semantics (SEM 2017), association for computational linguistics, Vancouver, Canada, pp 78–83, https://doi.org/10.18653/v1/S17-1008

  • Asri LE, He J, Suleman K (2016) A sequence-to-sequence model for user simulation in spoken dialogue systems. In: Morgan N (ed) Interspeech 2016, 17th annual conference of the international speech communication association, San Francisco, CA, USA, September 8–12, 2016, ISCA, pp 1151–1155, https://doi.org/10.21437/Interspeech.2016-1175

  • Aubert X, Dugast C, Ney H, Steinbiss V (1994) Large vocabulary continuous speech recognition of wall street journal data. In: Proceedings of ICASSP’94. IEEE International conference on acoustics, speech and signal processing, IEEE, vol 2, pp II–129

  • Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: Bengio Y, LeCun Y (eds) 3rd international conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, Conference Track Proceedings, arXiv:1409.0473

  • Baheti A, Ritter A, Small K (2020) Fluent response generation for conversational question answering. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 191–207, https://doi.org/10.18653/v1/2020.acl-main.19

  • Balakrishnan A, Rao J, Upasani K, White M, Subba R (2019) Constrained decoding for neural NLG from compositional representations in task-oriented dialogue. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 831–844, https://doi.org/10.18653/v1/P19-1080

  • Banerjee S, Lavie A (2005) METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization, association for computational linguistics, Ann Arbor, Michigan, pp 65–72, https://aclanthology.org/W05-0909

  • Bao S, He H, Wang F, Lian R, Wu H (2019) Know more about each other: Evolving dialogue strategy via compound assessment. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5382–5391, https://doi.org/10.18653/v1/P19-1535

  • Bao S, He H, Wang F, Wu H, Wang H (2020) PLATO: Pre-trained dialogue generation model with discrete latent variable. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 85–96, https://doi.org/10.18653/v1/2020.acl-main.9

  • Bapna A, Tür G, Hakkani-Tür D, Heck LP (2017) Towards zero-shot frame semantic parsing for domain scaling. In: Lacerda F (ed) Interspeech 2017, 18th annual conference of the international speech communication association, Stockholm, Sweden, August 20-24, 2017, ISCA, pp 2476–2480, http://www.isca-speech.org/archive/Interspeech_2017/abstracts/0518.html

  • Beeferman D, Brannon W, Roy D (2019) Radiotalk: A large-scale corpus of talk radio transcripts. In: Kubin G, Kacic Z (eds) Interspeech 2019, 20th annual conference of the international speech communication association, Graz, Austria, 15–19 September 2019, ISCA, pp 564–568, https://doi.org/10.21437/Interspeech.2019-2714

  • Bengio Y, Simard P, Frasconi P (1994) Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Netw 5(2):157–166

    Article  Google Scholar 

  • Bevendorff J, Al Khatib K, Potthast M, Stein B (2020) Crawling and preprocessing mailing lists at scale for dialog analysis. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 1151–1158, https://doi.org/10.18653/v1/2020.acl-main.108

  • Bi W, Gao J, Liu X, Shi S (2019) Fine-grained sentence functions for short-text conversation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational Linguistics, Florence, Italy, pp 3984–3993, https://doi.org/10.18653/v1/P19-1389

  • Bordes A, Usunier N, García-Durán A, Weston J, Yakhnenko O (2013) Translating embeddings for modeling multi-relational data. In: Burges CJC, Bottou L, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems 26: 27th annual conference on neural information processing systems 2013. Proceedings of a meeting held December 5–8, 2013, Lake Tahoe, Nevada, United States, pp 2787–2795, https://proceedings.neurips.cc/paper/2013/hash/1cecc7a77928ca8133fa24680a88d2f9-Abstract.html

  • Bordes A, Glorot X, Weston J, Bengio Y (2014) A semantic matching energy function for learning with multi-relational data. Mach Learn 94(2):233–259

    Article  MathSciNet  MATH  Google Scholar 

  • Bordes A, Boureau Y, Weston J (2017) Learning end-to-end goal-oriented dialog. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings, OpenReview.net, https://openreview.net/forum?id=S1Bb3D5gg

  • Bosselut A, Rashkin H, Sap M, Malaviya C, Celikyilmaz A, Choi Y (2019) COMET: Commonsense transformers for automatic knowledge graph construction. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Florence, Italy, pp 4762–4779, https://doi.org/10.18653/v1/P19-1470

  • Bouchacourt D, Baroni M (2019) Miss tools and mr fruit: Emergent communication in agents learning about object affordances. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3909–3918, https://doi.org/10.18653/v1/P19-1380

  • Boyd A, Puri R, Shoeybi M, Patwary M, Catanzaro B (2020) Large scale multi-actor generative dialog modeling. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, Online, pp 66–84, https://doi.org/10.18653/v1/2020.acl-main.8

  • Bruni E, Fernández R (2017) Adversarial evaluation for open-domain dialogue generation. In: Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, Association for Computational Linguistics, Saarbrücken, Germany, pp 284–288, https://doi.org/10.18653/v1/W17-5534

  • Budzianowski P, Wen TH, Tseng BH, Casanueva I, Ultes S, Ramadan O, Gašić M (2018) MultiWOZ - a large-scale multi-domain Wizard-of-Oz dataset for task-oriented dialogue modelling. In: Proceedings of the 2018 conference on empirical methods in natural language processing, association for computational linguistics, Brussels, Belgium, pp 5016–5026, https://doi.org/10.18653/v1/D18-1547

  • Busso C, Bulut M, Lee CC, Kazemzadeh A, Mower E, Kim S, Chang JN, Lee S, Narayanan SS (2008) Iemocap: Interactive emotional dyadic motion capture database. Lang Resour Eval 42(4):335–359

    Article  Google Scholar 

  • Byrne B, Krishnamoorthi K, Sankar C, Neelakantan A, Goodrich B, Duckworth D, Yavuz S, Dubey A, Kim KY, Cedilnik A (2019) Taskmaster-1: Toward a realistic and diverse dialog dataset. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), association for computational linguistics, Hong Kong, China, pp 4516–4525, https://doi.org/10.18653/v1/D19-1459

  • Cahill L, Doran C, Evans R, Mellish C, Paiva D, Reape M, Scott D, Tipper N (1999) In search of a reference architecture for nlg systems. In: Proceedings of the 7th European workshop on natural language generation, Citeseer, pp 77–85

  • Campagna G, Foryciarz A, Moradshahi M, Lam M (2020) Zero-shot transfer learning with synthesized data for multi-domain dialogue state tracking. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 122–132, https://doi.org/10.18653/v1/2020.acl-main.12

  • Cao J, Tanana M, Imel Z, Poitras E, Atkins D, Srikumar V (2019) Observing dialogue in therapy: Categorizing and forecasting behavioral codes. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5599–5611, https://doi.org/10.18653/v1/P19-1563

  • Carlson L, Okurowski ME, Marcu D (2002) RST discourse treebank. Linguistic Data Consortium, University of Pennsylvania

    Google Scholar 

  • Casanueva I, Temčinas T, Gerz D, Henderson M, Vulić I (2020) Efficient intent detection with dual sentence encoders. In: Proceedings of the 2nd workshop on natural language processing for conversational AI, association for computational linguistics, online, pp 38–45, https://doi.org/10.18653/v1/2020.nlp4convai-1.5

  • Chandramohan S, Geist M, Lefevre F, Pietquin O (2011) User simulation in dialogue systems using inverse reinforcement learning. In: Twelfth annual conference of the international speech communication association

  • Chauhan H, Firdaus M, Ekbal A, Bhattacharyya P (2019) Ordinal and attribute aware response generation in a multimodal dialogue system. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5437–5447, https://doi.org/10.18653/v1/P19-1540

  • Chen H, Liu X, Yin D, Tang J (2017) A survey on dialogue systems: Recent advances and new frontiers. Acm Sigkdd Explorations Newslett 19(2):25–35

    Article  Google Scholar 

  • Chen J, Yang D (2020) Multi-view sequence-to-sequence models with conversational structure for abstractive dialogue summarization. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 4106–4118, https://doi.org/10.18653/v1/2020.emnlp-main.336

  • Chen J, Zhang R, Mao Y, Xu J (2020a) Parallel interactive networks for multi-domain dialogue state generation. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online, pp 1921–1931, https://doi.org/10.18653/v1/2020.emnlp-main.151

  • Chen L, Zhou X, Chang C, Yang R, Yu K (2017b) Agent-aware dropout DQN for safe and efficient on-line dialogue policy learning. In: Proceedings of the 2017 conference on empirical methods in natural language processing, association for computational linguistics, Copenhagen, Denmark, pp 2454–2464, https://doi.org/10.18653/v1/D17-1260

  • Chen M, Liu R, Shen L, Yuan S, Zhou J, Wu Y, He X, Zhou B (2020b) The JDDC corpus: A large-scale multi-turn Chinese dialogue dataset for E-commerce customer service. In: Proceedings of the 12th language resources and evaluation conference, European language resources association, Marseille, France, pp 459–466, https://aclanthology.org/2020.lrec-1.58

  • Chen W, Chen J, Qin P, Yan X, Wang WY (2019a) Semantically conditioned dialog response generation via hierarchical disentangled self-attention. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3696–3709, https://doi.org/10.18653/v1/P19-1360

  • Chen X, Xu J, Xu B (2019b) A working memory model for task-oriented dialog response generation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 2687–2693, https://doi.org/10.18653/v1/P19-1258

  • Chen X, Meng F, Li P, Chen F, Xu S, Xu B, Zhou J (2020c) Bridging the gap between prior and posterior knowledge selection for knowledge-grounded dialogue generation. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 3426–3437, https://doi.org/10.18653/v1/2020.emnlp-main.275

  • Chen Y, Hakkani-Tür D, Tür G, Gao J, Deng L (2016) End-to-end memory networks with knowledge carryover for multi-turn spoken language understanding. In: Morgan N (ed) Interspeech 2016, 17th Annual conference of the international speech communication association, San Francisco, CA, USA, September 8–12, 2016, ISCA, pp 3245–3249, https://doi.org/10.21437/Interspeech.2016-312

  • Chen YC, Li L, Yu L, El Kholy A, Ahmed F, Gan Z, Cheng Y, Liu J (2019c) Uniter: Learning universal image-text representations. ECCV

  • Cheng J, Agrawal D, Martínez Alonso H, Bhargava S, Driesen J, Flego F, Kaplan D, Kartsaklis D, Li L, Piraviperumal D, Williams JD, Yu H, Ó Séaghdha D, Johannsen A (2020) Conversational semantic parsing for dialog state tracking. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 8107–8117, https://doi.org/10.18653/v1/2020.emnlp-main.651

  • Cho H, May J (2020) Grounding conversations with improvised dialogues. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 2398–2413, https://doi.org/10.18653/v1/2020.acl-main.218

  • Cho K, van Merriënboer B, Bahdanau D, Bengio Y (2014a) On the properties of neural machine translation: Encoder–decoder approaches. In: Proceedings of SSST-8, eighth workshop on syntax, semantics and structure in statistical translation, association for computational linguistics, Doha, Qatar, pp 103–111, https://doi.org/10.3115/v1/W14-4012

  • Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014b) Learning phrase representations using RNN encoder–decoder for statistical machine translation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, Doha, Qatar, pp 1724–1734, https://doi.org/10.3115/v1/D14-1179

  • Choi E, He H, Iyyer M, Yatskar M, Yih Wt, Choi Y, Liang P, Zettlemoyer L (2018) QuAC: Question answering in context. In: Proceedings of the 2018 conference on empirical methods in natural language processing, association for computational linguistics, Brussels, Belgium, pp 2174–2184, https://doi.org/10.18653/v1/D18-1241

  • Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv:1412.3555

  • Chung YL, Kuzmenko E, Tekiroglu SS, Guerini M (2019) CONAN - COunter NArratives through nichesourcing: a multilingual dataset of responses to fight online hate speech. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 2819–2829, https://doi.org/10.18653/v1/P19-1271

  • Cogswell M, Lu J, Jain R, Lee S, Parikh D, Batra D (2020) Dialog without dialog data: Learning visual dialog agents from VQA data. In: Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (eds) Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, https://proceedings.neurips.cc/paper/2020/hash/e7023ba77a45f7e84c5ee8a28dd63585-Abstract.html

  • Conneau A, Schwenk H, Barrault L, Lecun Y (2017) Very deep convolutional networks for text classification. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics: volume 1, long papers, association for computational linguistics, Valencia, Spain, pp 1107–1116, https://aclanthology.org/E17-1104

  • Coope S, Farghly T, Gerz D, Vulić I, Henderson M (2020) Span-ConveRT: Few-shot span extraction for dialog with pretrained conversational representations. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, Online, pp 107–121, https://doi.org/10.18653/v1/2020.acl-main.11

  • Csáky R, Purgai P, Recski G (2019) Improving neural conversational models with entropy-based data filtering. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5650–5669, https://doi.org/10.18653/v1/P19-1567

  • Cui L, Wu Y, Liu S, Zhang Y, Zhou M (2020) MuTual: A dataset for multi-turn dialogue reasoning. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 1406–1416, https://doi.org/10.18653/v1/2020.acl-main.130

  • Dai Y, Li H, Tang C, Li Y, Sun J, Zhu X (2020) Learning low-resource end-to-end goal-oriented dialog for fast and reliable system deployment. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 609–618, https://doi.org/10.18653/v1/2020.acl-main.57

  • Dai Z, Yang Z, Yang Y, Carbonell J, Le Q, Salakhutdinov R (2019) Transformer-XL: Attentive language models beyond a fixed-length context. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 2978–2988, https://doi.org/10.18653/v1/P19-1285

  • Dalton J, Xiong C, Callan J (2020) Trec cast 2019: The conversational assistance track overview. http://arxiv.org/abs/2003.13624

  • Danescu-Niculescu-Mizil C, Lee L (2011) Chameleons in imagined conversations: A new approach to understanding coordination of linguistic style in dialogs. In: Proceedings of the 2nd workshop on cognitive modeling and computational linguistics, association for computational linguistics, Portland, Oregon, USA, pp 76–87, https://aclanthology.org/W11-0609

  • Danescu-Niculescu-Mizil C, Sudhof M, Jurafsky D, Leskovec J, Potts C (2013) A computational approach to politeness with application to social factors. In: Proceedings of the 51st annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Sofia, Bulgaria, pp 250–259, https://aclanthology.org/P13-1025

  • Deng L, Tur G, He X, Hakkani-Tur D (2012) Use of kernel deep convex networks and end-to-end learning for spoken language understanding. In: 2012 IEEE spoken language technology workshop (SLT), IEEE, pp 210–215

  • Deoras A, Sarikaya R (2013) Deep belief network based semantic taggers for spoken language understanding. In: Interspeech, pp 2713–2717

  • Deriu J, Tuggener D, von Däniken P, Campos JA, Rodrigo A, Belkacem T, Soroa A, Agirre E, Cieliebak M (2020) Spot the bot: A robust and efficient framework for the evaluation of conversational dialogue systems. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 3971–3984, https://doi.org/10.18653/v1/2020.emnlp-main.326

  • Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers), association for computational linguistics, Minneapolis, Minnesota, pp 4171–4186, https://doi.org/10.18653/v1/N19-1423

  • Dhingra B, Li L, Li X, Gao J, Chen YN, Ahmed F, Deng L (2017) Towards end-to-end reinforcement learning of dialogue agents for information access. In: Proceedings of the 55th annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Vancouver, Canada, pp 484–495, https://doi.org/10.18653/v1/P17-1045

  • Dinan E, Logacheva V, Malykh V, Miller A, Shuster K, Urbanek J, Kiela D, Szlam A, Serban I, Lowe R, et al. (2019a) The second conversational intelligence challenge (convai2). https://arxiv.org/abs/1902.00098

  • Dinan E, Roller S, Shuster K, Fan A, Auli M, Weston J (2019b) Wizard of wikipedia: Knowledge-powered conversational agents. In: 7th International conference on learning representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019, OpenReview.net, https://openreview.net/forum?id=r1l73iRqKm

  • Dong L, Huang S, Wei F, Lapata M, Zhou M, Xu K (2017) Learning to generate product reviews from attributes. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics: volume 1, long papers, association for computational linguistics, Valencia, Spain, pp 623–632, https://aclanthology.org/E17-1059

  • Du N, Chen K, Kannan A, Tran L, Chen Y, Shafran I (2019) Extracting symptoms and their status from clinical conversations. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 915–925, https://doi.org/10.18653/v1/P19-1087

  • Du W, Black AW (2019) Boosting dialog response generation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 38–43, https://doi.org/10.18653/v1/P19-1005

  • Dušek O, Jurčíček F (2016a) A context-aware natural language generator for dialogue systems. In: Proceedings of the 17th annual meeting of the special interest group on discourse and dialogue, association for computational linguistics, Los Angeles, pp 185–190, https://doi.org/10.18653/v1/W16-3622

  • Dušek O, Jurčíček F (2016b) Sequence-to-sequence generation for spoken dialogue via deep syntax trees and strings. In: Proceedings of the 54th annual meeting of the association for computational linguistics (volume 2: short papers), association for computational linguistics, Berlin, Germany, pp 45–51, https://doi.org/10.18653/v1/P16-2008

  • El Asri L, Schulz H, Sharma S, Zumer J, Harris J, Fine E, Mehrotra R, Suleman K (2017) Frames: a corpus for adding memory to goal-oriented dialogue systems. In: Proceedings of the 18th annual sigdial meeting on discourse and dialogue, association for computational linguistics, Saarbrücken, Germany, pp 207–219, https://doi.org/10.18653/v1/W17-5526

  • Elder H, O’Connor A, Foster J (2020) How to make neural natural language generation as reliable as templates in task-oriented dialogue. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 2877–2888, https://doi.org/10.18653/v1/2020.emnlp-main.230

  • Elman JL (1990) Finding structure in time. Cogn Sci 14(2):179–211

    Article  Google Scholar 

  • Eric M, Krishnan L, Charette F, Manning CD (2017) Key-value retrieval networks for task-oriented dialogue. In: Proceedings of the 18th annual SIGdial meeting on discourse and dialogue, association for computational linguistics, Saarbrücken, Germany, pp 37–49, https://doi.org/10.18653/v1/W17-5506

  • Estève Y, Bazillon T, Antoine JY, Béchet F, Farinas J (2010) The EPAC corpus: Manual and automatic annotations of conversational speech in French broadcast news. In: Proceedings of the seventh international conference on language resources and evaluation (LREC’10), European Language Resources Association (ELRA), Valletta, Malta, http://www.lrec-conf.org/proceedings/lrec2010/pdf/650_Paper.pdf

  • Fan A, Jernite Y, Perez E, Grangier D, Weston J, Auli M (2019) ELI5: Long form question answering. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3558–3567, https://doi.org/10.18653/v1/P19-1346

  • Fan M, Zhou Q, Chang E, Zheng TF (2014) Transition-based knowledge graph embedding with relational mapping properties. In: Proceedings of the 28th Pacific asia conference on language, information and computing, department of linguistics, Chulalongkorn University, Phuket, Thailand, pp 328–337, https://aclanthology.org/Y14-1039

  • Feldman Y, El-Yaniv R (2019) Multi-hop paragraph retrieval for open-domain question answering. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 2296–2309, https://doi.org/10.18653/v1/P19-1222

  • Feng J, Tao C, Wu W, Feng Y, Zhao D, Yan R (2019) Learning a matching model with co-teaching for multi-turn response selection in retrieval-based dialogue systems. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3805–3815, https://doi.org/10.18653/v1/P19-1370

  • Feng S, Chen H, Li K, Yin D (2020a) Posterior-gan: Towards informative and coherent response generation with posterior generative adversarial network. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020, AAAI Press, pp 7708–7715, https://aaai.org/ojs/index.php/AAAI/article/view/6273

  • Feng S, Ren X, Chen H, Sun B, Li K, Sun X (2020b) Regularizing dialogue generation by imitating implicit scenarios. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 6592–6604, https://doi.org/10.18653/v1/2020.emnlp-main.534

  • Feng S, Wan H, Gunasekara C, Patel S, Joshi S, Lastras L (2020c) doc2dial: A goal-oriented document-grounded dialogue dataset. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 8118–8128, https://doi.org/10.18653/v1/2020.emnlp-main.652

  • Ferracane E, Durrett G, Li JJ, Erk K (2019) Evaluating discourse in structured text representations. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 646–653, https://doi.org/10.18653/v1/P19-1062

  • Ficler J, Goldberg Y (2017) Controlling linguistic style aspects in neural language generation. In: Proceedings of the workshop on stylistic variation, association for computational linguistics, Copenhagen, Denmark, pp 94–104, https://doi.org/10.18653/v1/W17-4912

  • Finn C, Abbeel P, Levine S (2017) Model-agnostic meta-learning for fast adaptation of deep networks. In: Precup D, Teh YW (eds) Proceedings of the 34th international conference on machine learning, ICML 2017, Sydney, NSW, Australia, 6–11 August 2017, PMLR, proceedings of machine learning research, vol 70, pp 1126–1135, http://proceedings.mlr.press/v70/finn17a.html

  • Fung P, Dey A, Siddique FB, Lin R, Yang Y, Bertero D, Wan Y, Chan RHY, Wu CS (2016) Zara: A virtual interactive dialogue system incorporating emotion, sentiment and personality recognition. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: system demonstrations, The COLING 2016 Organizing Committee, Osaka, Japan, pp 278–281, https://aclanthology.org/C16-2058

  • Galley M, Brockett C, Sordoni A, Ji Y, Auli M, Quirk C, Mitchell M, Gao J, Dolan B (2015) deltaBLEU: A discriminative metric for generation tasks with intrinsically diverse targets. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (volume 2: short papers), association for computational linguistics, Beijing, China, pp 445–450, https://doi.org/10.3115/v1/P15-2073

  • Gan Z, Cheng Y, Kholy A, Li L, Liu J, Gao J (2019) Multi-step reasoning via recurrent dual attention for visual dialog. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 6463–6474, https://doi.org/10.18653/v1/P19-1648

  • Gan Z, Chen Y, Li L, Zhu C, Cheng Y, Liu J (2020) Large-scale adversarial training for vision-and-language representation learning. In: Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (eds) Advances in neural information processing systems 33: annual conference on neural information processing systems 2020, NeurIPS 2020, December 6–12, 2020, virtual, https://proceedings.neurips.cc/paper/2020/hash/49562478de4c54fafd4ec46fdb297de5-Abstract.html

  • Gangadharaiah R, Narayanaswamy B (2020) Recursive template-based frame generation for task oriented dialog. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 2059–2064, https://doi.org/10.18653/v1/2020.acl-main.186

  • Gao J, Galley M, Li L (2018) Neural approaches to conversational AI. In: Collins-Thompson K, Mei Q, Davison BD, Liu Y, Yilmaz E (eds) The 41st international ACM SIGIR conference on research & development in information retrieval, SIGIR 2018, Ann Arbor, MI, USA, July 08–12, 2018, ACM, pp 1371–1374, https://doi.org/10.1145/3209978.3210183

  • Gao S, Zhang Y, Ou Z, Yu Z (2020a) Paraphrase augmented task-oriented dialog generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 639–649, https://doi.org/10.18653/v1/2020.acl-main.60

  • Gao X, Zhang Y, Lee S, Galley M, Brockett C, Gao J, Dolan B (2019) Structuring latent spaces for stylized response generation. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), association for computational linguistics, Hong Kong, China, pp 1814–1823, https://doi.org/10.18653/v1/D19-1190

  • Gao X, Zhang Y, Galley M, Brockett C, Dolan B (2020b) Dialogue response ranking training with large-scale human feedback data. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 386–395, https://doi.org/10.18653/v1/2020.emnlp-main.28

  • Gao Y, Wu CS, Joty S, Xiong C, Socher R, King I, Lyu M, Hoi SC (2020c) Explicit memory tracker with coarse-to-fine reasoning for conversational machine reading. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 935–945, https://doi.org/10.18653/v1/2020.acl-main.88

  • Gehring J, Auli M, Grangier D, Yarats D, Dauphin YN (2017) Convolutional sequence to sequence learning. In: Precup D, Teh YW (eds) Proceedings of the 34th international conference on machine learning, ICML 2017, Sydney, NSW, Australia, 6–11 August 2017, PMLR, proceedings of machine learning research, vol 70, pp 1243–1252, http://proceedings.mlr.press/v70/gehring17a.html

  • Ghazvininejad M, Brockett C, Chang M, Dolan B, Gao J, Yih W, Galley M (2018) A knowledge-grounded neural conversation model. In: McIlraith SA, Weinberger KQ (eds) Proceedings of the thirty-second AAAI conference on artificial intelligence, (AAAI-18), the 30th innovative applications of artificial intelligence (IAAI-18), and the 8th AAAI symposium on educational advances in artificial intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, AAAI Press, pp 5110–5117, https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16710

  • Gliwa B, Mochol I, Biesek M, Wawer A (2019) SAMSum corpus: A human-annotated dialogue dataset for abstractive summarization. In: Proceedings of the 2nd workshop on new frontiers in summarization, association for computational linguistics, Hong Kong, China, pp 70–79, https://doi.org/10.18653/v1/D19-5409

  • Goddeau D, Meng H, Polifroni J, Seneff S, Busayapongchai S (1996) A form-based dialogue manager for spoken language applications. In: Proceeding of fourth international conference on spoken language processing. ICSLP’96, IEEE, vol 2, pp 701–704

  • Golovanov S, Kurbanov R, Nikolenko S, Truskovskyi K, Tselousov A, Wolf T (2019) Large-scale transfer learning for natural language generation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 6053–6058, https://doi.org/10.18653/v1/P19-1608

  • Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial networks. arXiv:1406.2661

  • Gopalakrishnan K, Hedayatnia B, Chen Q, Gottardi A, Kwatra S, Venkatesh A, Gabriel R, Hakkani-Tür D (2019) Topical-chat: Towards knowledge-grounded open-domain conversations. In: Kubin G, Kacic Z (eds) Interspeech 2019, 20th annual conference of the international speech communication association, Graz, Austria, 15–19 September 2019, ISCA, pp 1891–1895, https://doi.org/10.21437/Interspeech.2019-3079

  • Gordon-Hall G, Gorinski PJ, Cohen SB (2020) Learning dialog policies from weak demonstrations. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 1394–1405, https://doi.org/10.18653/v1/2020.acl-main.129

  • Graves A, Wayne G, Danihelka I (2014) Neural Turing machines

  • Graves A, Wayne G, Reynolds M, Harley T, Danihelka I, Grabska-Barwińska A, Colmenarejo SG, Grefenstette E, Ramalho T, Agapiou J et al (2016) Hybrid computing using a neural network with dynamic external memory. Nature 538(7626):471–476

  • Gruber N, Jockisch A (2020) Are GRU cells more specific and LSTM cells more sensitive in motive classification of text? Front Artif Intell 3(40):1–6

  • Gu J, Lu Z, Li H, Li VO (2016) Incorporating copying mechanism in sequence-to-sequence learning. In: Proceedings of the 54th annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Berlin, Germany, pp 1631–1640, https://doi.org/10.18653/v1/P16-1154

  • Guo Q, Qiu X, Liu P, Shao Y, Xue X, Zhang Z (2019) Star-transformer. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers), association for computational linguistics, Minneapolis, Minnesota, pp 1315–1325, https://doi.org/10.18653/v1/N19-1133

  • Guo X, Yu M, Gao Y, Gan C, Campbell M, Chang S (2020) Interactive fiction game playing as multi-paragraph reading comprehension with reinforcement learning. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 7755–7765, https://doi.org/10.18653/v1/2020.emnlp-main.624

  • Gür I, Hakkani-Tür D, Tür G, Shah P (2018) User modeling for task oriented dialogues. In: 2018 IEEE spoken language technology workshop (SLT), IEEE, pp 900–906

  • Haber J, Baumgärtner T, Takmaz E, Gelderloos L, Bruni E, Fernández R (2019) The PhotoBook dataset: Building common ground through visually-grounded dialogue. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 1895–1910, https://doi.org/10.18653/v1/P19-1184

  • Hahn M, Krantz J, Batra D, Parikh D, Rehg J, Lee S, Anderson P (2020) Where are you? Localization from embodied dialog. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 806–822, https://doi.org/10.18653/v1/2020.emnlp-main.59

  • Hakkani-Tür D, Tür G, Celikyilmaz A, Chen Y, Gao J, Deng L, Wang Y (2016) Multi-domain joint semantic frame parsing using bi-directional RNN-LSTM. In: Morgan N (ed) Interspeech 2016, 17th annual conference of the international speech communication association, San Francisco, CA, USA, September 8–12, 2016, ISCA, pp 715–719, https://doi.org/10.21437/Interspeech.2016-402

  • Ham D, Lee JG, Jang Y, Kim KE (2020) End-to-end neural pipeline for goal-oriented dialogue systems using GPT-2. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 583–592, https://doi.org/10.18653/v1/2020.acl-main.54

  • Han M, Kang M, Jung H, Hwang SJ (2019) Episodic memory reader: Learning what to remember for question answering from streaming data. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 4407–4417, https://doi.org/10.18653/v1/P19-1434

  • Hancock B, Bordes A, Mazare PE, Weston J (2019) Learning from dialogue after deployment: Feed yourself, chatbot! In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3667–3684, https://doi.org/10.18653/v1/P19-1358

  • Hashemi HB, Asiaee A, Kraft R (2016) Query intent detection using convolutional neural networks. In: International conference on web search and data mining, workshop on query understanding

  • He H, Balakrishnan A, Eric M, Liang P (2017) Learning symmetric collaborative dialogue agents with dynamic knowledge graph embeddings. In: Proceedings of the 55th annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Vancouver, Canada, pp 1766–1776, https://doi.org/10.18653/v1/P17-1162

  • He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016, IEEE Computer Society, pp 770–778, https://doi.org/10.1109/CVPR.2016.90

  • He T, Glass J (2020) Negative training for neural dialogue response generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 2044–2058, https://doi.org/10.18653/v1/2020.acl-main.185

  • He W, Yang M, Yan R, Li C, Shen Y, Xu R (2020a) Amalgamating knowledge from two teachers for task-oriented dialogue system with adversarial training. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 3498–3507, https://doi.org/10.18653/v1/2020.emnlp-main.281

  • He X, Chen S, Ju Z, Dong X, Fang H, Wang S, Yang Y, Zeng J, Zhang R, Zhang R, et al. (2020b) MedDialog: Two large-scale medical dialogue datasets

  • Henderson J, Lemon O, Georgila K (2008) Hybrid reinforcement/supervised learning of dialogue policies from fixed data sets. Comput Linguist 34(4):487–511. https://doi.org/10.1162/coli.2008.07-028-R2-05-82

  • Henderson M (2015) Machine learning for dialog state tracking: A review. In: Proceedings of the first international workshop on machine learning in spoken language processing

  • Henderson M, Thomson B, Young S (2013) Deep neural network approach for the dialog state tracking challenge. In: Proceedings of the SIGDIAL 2013 conference, association for computational linguistics, Metz, France, pp 467–471, https://aclanthology.org/W13-4073

  • Henderson M, Thomson B, Williams JD (2014a) The second dialog state tracking challenge. In: Proceedings of the 15th annual meeting of the special interest group on discourse and dialogue (SIGDIAL), association for computational linguistics, Philadelphia, PA, U.S.A., pp 263–272, https://doi.org/10.3115/v1/W14-4337

  • Henderson M, Thomson B, Williams JD (2014b) The third dialog state tracking challenge. In: 2014 IEEE spoken language technology workshop (SLT), IEEE, pp 324–329

  • Henderson M, Budzianowski P, Casanueva I, Coope S, Gerz D, Kumar G, Mrkšić N, Spithourakis G, Su PH, Vulić I, Wen TH (2019a) A repository of conversational datasets. In: Proceedings of the first workshop on NLP for conversational AI, association for computational linguistics, Florence, Italy, pp 1–10, https://doi.org/10.18653/v1/W19-4101

  • Henderson M, Vulić I, Gerz D, Casanueva I, Budzianowski P, Coope S, Spithourakis G, Wen TH, Mrkšić N, Su PH (2019b) Training neural response selection for task-oriented dialogue systems. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5392–5404, https://doi.org/10.18653/v1/P19-1536

  • Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

  • Hochreiter S, Bengio Y, Frasconi P, Schmidhuber J, et al. (2001) Gradient flow in recurrent nets: the difficulty of learning long-term dependencies

  • Hokamp C, Liu Q (2017) Lexically constrained decoding for sequence generation using grid beam search. In: Proceedings of the 55th annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Vancouver, Canada, pp 1535–1546, https://doi.org/10.18653/v1/P17-1141

  • Hopfield JJ (1982) Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci 79(8):2554–2558

  • Hosseini-Asl E, McCann B, Wu C, Yavuz S, Socher R (2020) A simple language model for task-oriented dialogue. In: Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (eds) Advances in neural information processing systems 33: annual conference on neural information processing systems 2020, NeurIPS 2020, December 6–12, 2020, virtual, https://proceedings.neurips.cc/paper/2020/hash/e946209592563be0f01c844ab2170f0c-Abstract.html

  • Hu J, Yang Y, Chen C, He L, Yu Z (2020) SAS: Dialogue state tracking via slot attention and slot information sharing. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 6366–6375, https://doi.org/10.18653/v1/2020.acl-main.567

  • Hu JE, Rudinger R, Post M, Durme BV (2019) PARABANK: monolingual bitext generation and sentential paraphrasing via lexically-constrained neural machine translation. In: The thirty-third AAAI conference on artificial intelligence, AAAI 2019, the thirty-first innovative applications of artificial intelligence conference, IAAI 2019, the ninth AAAI symposium on educational advances in artificial intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27–February 1, 2019, AAAI Press, pp 6521–6528, https://doi.org/10.1609/aaai.v33i01.33016521

  • Hu Z, Yang Z, Liang X, Salakhutdinov R, Xing EP (2017) Toward controlled generation of text. In: Precup D, Teh YW (eds) Proceedings of the 34th international conference on machine learning, ICML 2017, Sydney, NSW, Australia, 6–11 August 2017, PMLR, proceedings of machine learning research, vol 70, pp 1587–1596, http://proceedings.mlr.press/v70/hu17e.html

  • Hua X, Wang L (2019) Sentence-level content planning and style specification for neural text generation. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), association for computational linguistics, Hong Kong, China, pp 591–602, https://doi.org/10.18653/v1/D19-1055

  • Hua Y, Li YF, Haffari G, Qi G, Wu T (2020) Few-shot complex knowledge base question answering via meta reinforcement learning. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 5827–5837, https://doi.org/10.18653/v1/2020.emnlp-main.469

  • Huang L, Ye Z, Qin J, Lin L, Liang X (2020a) GRADE: Automatic graph-enhanced coherence metric for evaluating open-domain dialogue systems. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 9230–9240, https://doi.org/10.18653/v1/2020.emnlp-main.742

  • Huang X, Jiang J, Zhao D, Feng Y, Hong Y (2018) Natural language processing and Chinese computing: 6th CCF international conference, NLPCC 2017, Dalian, China, November 8–12, 2017, Proceedings, vol 10619. Springer

  • Huang X, Qi J, Sun Y, Zhang R (2020b) Semi-supervised dialogue policy learning via stochastic reward estimation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 660–670, https://doi.org/10.18653/v1/2020.acl-main.62

  • Huang Y, Feng J, Hu M, Wu X, Du X, Ma S (2020c) Meta-reinforced multi-domain state generator for dialogue systems. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 7109–7118, https://doi.org/10.18653/v1/2020.acl-main.636

  • Huang Z, Zeng Z, Liu B, Fu D, Fu J (2020d) Pixel-BERT: aligning image pixels with text by deep multi-modal transformers. https://arxiv.org/abs/2004.00849

  • Jaderberg M, Mnih V, Czarnecki WM, Schaul T, Leibo JZ, Silver D, Kavukcuoglu K (2017) Reinforcement learning with unsupervised auxiliary tasks. In: 5th international conference on learning representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings, OpenReview.net, https://openreview.net/forum?id=SJ6yPD5xg

  • Jang Y, Song Y, Yu Y, Kim Y, Kim G (2017) TGIF-QA: toward spatio-temporal reasoning in visual question answering. In: 2017 IEEE conference on computer vision and pattern recognition, CVPR 2017, Honolulu, HI, USA, July 21–26, 2017, IEEE Computer Society, pp 1359–1367, https://doi.org/10.1109/CVPR.2017.149

  • Jaques N, Shen JH, Ghandeharioun A, Ferguson C, Lapedriza A, Jones N, Gu S, Picard R (2020) Human-centric dialog training via offline reinforcement learning. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 3985–4003, https://doi.org/10.18653/v1/2020.emnlp-main.327

  • Ji C, Zhou X, Zhang Y, Liu X, Sun C, Zhu C, Zhao T (2020) Cross copy network for dialogue generation. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 1900–1910, https://doi.org/10.18653/v1/2020.emnlp-main.149

  • Ji G, He S, Xu L, Liu K, Zhao J (2015) Knowledge graph embedding via dynamic mapping matrix. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (volume 1: long papers), association for computational linguistics, Beijing, China, pp 687–696, https://doi.org/10.3115/v1/P15-1067

  • Ji S, Pan S, Cambria E, Marttinen P, Yu PS (2022) A survey on knowledge graphs: Representation, acquisition and applications. IEEE Trans Neural Netw Learn Syst 33(10):1–8

  • Jia Q, Liu Y, Ren S, Zhu K, Tang H (2020) Multi-turn response selection using dialogue dependency relations. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 1911–1920, https://doi.org/10.18653/v1/2020.emnlp-main.150

  • Jordan M (1986) Serial order: a parallel distributed processing approach. Technical report, June 1985–March 1986. Tech. rep., California Univ., San Diego, La Jolla (USA). Inst. for Cognitive Science

  • Jung J, Son B, Lyu S (2020) AttnIO: knowledge graph exploration with in-and-out attention flow for knowledge-grounded dialogue. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 3484–3497, https://doi.org/10.18653/v1/2020.emnlp-main.280

  • Jurafsky D (1997) Switchboard swbd-damsl shallow-discourse-function annotation coders manual. Institute of Cognitive Science Technical Report

  • K M A, Basu Roy Chowdhury S, Dukkipati A (2018) Learning beyond datasets: Knowledge graph augmented neural networks for natural language processing. In: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long papers), association for computational linguistics, New Orleans, Louisiana, pp 313–322, https://doi.org/10.18653/v1/N18-1029

  • Kale M, Rastogi A (2020) Template guided text generation for task-oriented dialogue. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 6505–6520, https://doi.org/10.18653/v1/2020.emnlp-main.527

  • Kamezawa H, Nishida N, Shimizu N, Miyazaki T, Nakayama H (2020) A visually-grounded first-person dialogue dataset with verbal and non-verbal responses. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 3299–3310, https://doi.org/10.18653/v1/2020.emnlp-main.267

  • Kannan A, Vinyals O (2017) Adversarial evaluation of dialogue models. https://arxiv.org/abs/1701.08198

  • Keskar NS, McCann B, Varshney LR, Xiong C, Socher R (2019) CTRL: A conditional transformer language model for controllable generation. https://arxiv.org/abs/1909.05858

  • Kim A, Song HJ, Park SB, et al. (2018) A two-step neural dialog state tracker for task-oriented dialog processing. Computational intelligence and neuroscience 2018

  • Kim H, Kim B, Kim G (2020a) Will I sound like me? improving persona consistency in dialogues through pragmatic self-consciousness. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 904–916, https://doi.org/10.18653/v1/2020.emnlp-main.65

  • Kim S, D’Haro LF, Banchs RE, Williams JD, Henderson M, Yoshino K (2016) The fifth dialog state tracking challenge. In: 2016 IEEE spoken language technology workshop (SLT), IEEE, pp 511–517

  • Kim S, D’Haro LF, Banchs RE, Williams JD, Henderson M (2017) The fourth dialog state tracking challenge. In: Dialogues with social robots. Springer, pp 435–449

  • Kim S, Yang S, Kim G, Lee SW (2020b) Efficient dialogue state tracking by selectively overwriting memory. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 567–582, https://doi.org/10.18653/v1/2020.acl-main.53

  • Ko WJ, Ray A, Shen Y, Jin H (2020) Generating dialogue responses from a semantic latent space. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 4339–4349, https://doi.org/10.18653/v1/2020.emnlp-main.352

  • Konda VR, Tsitsiklis JN (2000) Actor-critic algorithms. In: Advances in neural information processing systems, Citeseer, pp 1008–1014

  • Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Bartlett PL, Pereira FCN, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems 25: 26th annual conference on neural information processing systems 2012. Proceedings of a meeting held December 3–6, 2012, Lake Tahoe, Nevada, United States, pp 1106–1114, https://proceedings.neurips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html

  • Kummerfeld JK, Gouravajhala SR, Peper JJ, Athreya V, Gunasekara C, Ganhotra J, Patel SS, Polymenakos LC, Lasecki W (2019) A large-scale corpus for conversation disentanglement. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3846–3856, https://doi.org/10.18653/v1/P19-1374

  • Kundu S, Lin Q, Ng HT (2020) Learning to identify follow-up questions in conversational question answering. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 959–968, https://doi.org/10.18653/v1/2020.acl-main.90

  • Kurach K, Andrychowicz M, Sutskever I (2016) Neural random-access machines. In: Bengio Y, LeCun Y (eds) 4th international conference on learning representations, ICLR 2016, San Juan, Puerto Rico, May 2–4, 2016, conference track proceedings, http://arxiv.org/abs/1511.06392

  • Larson S, Mahendran A, Peper JJ, Clarke C, Lee A, Hill P, Kummerfeld JK, Leach K, Laurenzano MA, Tang L, Mars J (2019) An evaluation dataset for intent classification and out-of-scope prediction. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), association for computational linguistics, Hong Kong, China, pp 1311–1316, https://doi.org/10.18653/v1/D19-1131

  • Le H, Hoi SC (2020) Video-grounded dialogues with pretrained generation language models. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 5842–5848, https://doi.org/10.18653/v1/2020.acl-main.518

  • Le H, Sahoo D, Chen N, Hoi S (2019) Multimodal transformer networks for end-to-end video-grounded dialogue systems. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5612–5623, https://doi.org/10.18653/v1/P19-1564

  • Le H, Sahoo D, Chen N, Hoi SC (2020a) BiST: Bi-directional spatio-temporal reasoning for video-grounded dialogues. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 1846–1859, https://doi.org/10.18653/v1/2020.emnlp-main.145

  • Le H, Sahoo D, Liu C, Chen N, Hoi SC (2020b) UniConv: a unified conversational neural architecture for multi-domain task-oriented dialogues. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 1860–1877, https://doi.org/10.18653/v1/2020.emnlp-main.146

  • LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

  • Lee JY, Dernoncourt F (2016) Sequential short-text classification with recurrent and convolutional neural networks. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies, association for computational linguistics, San Diego, California, pp 515–520, https://doi.org/10.18653/v1/N16-1062

  • Lee S (2013) Structured discriminative model for dialog state tracking. In: Proceedings of the SIGDIAL 2013 conference, association for computational linguistics, Metz, France, pp 442–451, https://aclanthology.org/W13-4069

  • Lee S, Eskenazi M (2013) Recipe for building robust spoken dialog state trackers: Dialog state tracking challenge system description. In: Proceedings of the SIGDIAL 2013 conference, association for computational linguistics, Metz, France, pp 414–422, https://aclanthology.org/W13-4066

  • Lee S, Jha R (2019) Zero-shot adaptive transfer for conversational language understanding. In: The thirty-third AAAI conference on artificial intelligence, AAAI 2019, the thirty-first innovative applications of artificial intelligence conference, IAAI 2019, the ninth AAAI symposium on educational advances in artificial intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27–February 1, 2019, AAAI Press, pp 6642–6649, https://doi.org/10.1609/aaai.v33i01.33016642

  • Lee S, Schulz H, Atkinson A, Gao J, Suleman K, El Asri L, Adada M, Huang M, Sharma S, Tay W et al (2019) Multi-domain task-completion dialog challenge. Dialog Syst Technol Chall 8:9

  • Lei W, Jin X, Kan MY, Ren Z, He X, Yin D (2018) Sequicity: Simplifying task-oriented dialogue systems with single sequence-to-sequence architectures. In: Proceedings of the 56th annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Melbourne, Australia, pp 1437–1447, https://doi.org/10.18653/v1/P18-1133

  • Lemon O, Pietquin O (2007) Machine learning for spoken dialogue systems. In: Eighth annual conference of the international speech communication association

  • Li G, Duan N, Fang Y, Gong M, Jiang D (2020a) Unicoder-VL: A universal encoder for vision and language by cross-modal pre-training. In: The thirty-fourth AAAI conference on artificial intelligence, AAAI 2020, the thirty-second innovative applications of artificial intelligence conference, IAAI 2020, the tenth AAAI symposium on educational advances in artificial intelligence, EAAI 2020, New York, NY, USA, February 7–12, 2020, AAAI Press, pp 11336–11344, https://aaai.org/ojs/index.php/AAAI/article/view/6795

  • Li J, Galley M, Brockett C, Gao J, Dolan B (2016a) A diversity-promoting objective function for neural conversation models. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies, association for computational linguistics, San Diego, California, pp 110–119, https://doi.org/10.18653/v1/N16-1014

  • Li J, Monroe W, Jurafsky D (2016b) A simple, fast diverse decoding algorithm for neural generation. https://arxiv.org/abs/1611.08562

  • Li J, Monroe W, Ritter A, Jurafsky D, Galley M, Gao J (2016c) Deep reinforcement learning for dialogue generation. In: Proceedings of the 2016 conference on empirical methods in natural language processing, association for computational linguistics, Austin, Texas, pp 1192–1202, https://doi.org/10.18653/v1/D16-1127

  • Li J, Miller AH, Chopra S, Ranzato M, Weston J (2017a) Dialogue learning with human-in-the-loop. In: 5th international conference on learning representations, ICLR 2017, Toulon, France, April 24–26, 2017, conference track proceedings, OpenReview.net, https://openreview.net/forum?id=HJgXCV9xx

  • Li J, Miller AH, Chopra S, Ranzato M, Weston J (2017b) Learning through dialogue interactions by asking questions. In: 5th international conference on learning representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings, OpenReview.net, https://openreview.net/forum?id=rkE8pVcle

  • Li J, Monroe W, Shi T, Jean S, Ritter A, Jurafsky D (2017c) Adversarial learning for neural dialogue generation. In: Proceedings of the 2017 conference on empirical methods in natural language processing, association for computational linguistics, Copenhagen, Denmark, pp 2157–2169, https://doi.org/10.18653/v1/D17-1230

  • Li L, Xu C, Wu W, Zhao Y, Zhao X, Tao C (2020b) Zero-resource knowledge-grounded dialogue generation. In: Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (eds) Advances in neural information processing systems 33: annual conference on neural information processing systems 2020, NeurIPS 2020, December 6–12, 2020, virtual, https://proceedings.neurips.cc/paper/2020/hash/609c5e5089a9aa967232aba2a4d03114-Abstract.html

  • Li LH, Yatskar M, Yin D, Hsieh CJ, Chang KW (2019a) Visualbert: A simple and performant baseline for vision and language. https://arxiv.org/abs/1908.03557

  • Li M, Roller S, Kulikov I, Welleck S, Boureau YL, Cho K, Weston J (2020c) Don’t say that! making inconsistent dialogue unlikely with unlikelihood training. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, Online, pp 4715–4728, https://doi.org/10.18653/v1/2020.acl-main.428

  • Li W, Shao W, Ji S, Cambria E (2022) Bieru: bidirectional emotional recurrent unit for conversational sentiment analysis. Neurocomputing 467:73–82

  • Li X, Lipton ZC, Dhingra B, Li L, Gao J, Chen YN (2016d) A user simulator for task-completion dialogues. https://arxiv.org/abs/1612.05688

  • Li X, Chen YN, Li L, Gao J, Celikyilmaz A (2017d) End-to-end task-completion neural dialogue systems. In: Proceedings of the eighth international joint conference on natural language processing (volume 1: long papers), Asian federation of natural language processing, Taipei, Taiwan, pp 733–743, https://aclanthology.org/I17-1074

  • Li X, Wang Y, Sun S, Panda S, Liu J, Gao J (2018) Microsoft dialogue challenge: building end-to-end task-completion dialogue systems. https://arxiv.org/abs/1807.11125

  • Li X, Yin F, Sun Z, Li X, Yuan A, Chai D, Zhou M, Li J (2019b) Entity-relation extraction as multi-turn question answering. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 1340–1350, https://doi.org/10.18653/v1/P19-1129

  • Li X, Yin X, Li C, Zhang P, Hu X, Zhang L, Wang L, Hu H, Dong L, Wei F, et al. (2020d) Oscar: Object-semantics aligned pre-training for vision-language tasks. In: European conference on computer vision, Springer, pp 121–137

  • Li Y (2017) Deep reinforcement learning: an overview. https://arxiv.org/abs/1701.07274

  • Li Y, Su H, Shen X, Li W, Cao Z, Niu S (2017e) DailyDialog: A manually labelled multi-turn dialogue dataset. In: Proceedings of the eighth international joint conference on natural language processing (volume 1: long papers), Asian federation of natural language processing, Taipei, Taiwan, pp 986–995, https://aclanthology.org/I17-1099

  • Li Y, Yao K, Qin L, Che W, Li X, Liu T (2020e) Slot-consistent NLG for task-oriented dialogue systems with iterative rectification network. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 97–106, https://doi.org/10.18653/v1/2020.acl-main.10

  • Li Z, Niu C, Meng F, Feng Y, Li Q, Zhou J (2019c) Incremental transformer with deliberation decoder for document grounded conversations. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 12–21, https://doi.org/10.18653/v1/P19-1002

  • Liang W, Zou J, Yu Z (2020) Beyond user self-reported Likert scale ratings: a comparison model for automatic dialog evaluation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 1363–1374, https://doi.org/10.18653/v1/2020.acl-main.126

  • Lin CY (2004) ROUGE: A package for automatic evaluation of summaries. In: Text summarization branches out, association for computational linguistics, Barcelona, Spain, pp 74–81, https://aclanthology.org/W04-1013

  • Lin LJ (1992) Self-improving reactive agents based on reinforcement learning, planning and teaching. Mach Learn 8(3–4):293–321

  • Lin T, Wang Y, Liu X, Qiu X (2021) A survey of transformers. https://arxiv.org/abs/2106.04554

  • Lin X, Joty S, Jwalapuram P, Bari MS (2019) A unified linear-time framework for sentence-level discourse parsing. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Florence, Italy, pp 4190–4200, https://doi.org/10.18653/v1/P19-1410

  • Lin X, Jian W, He J, Wang T, Chu W (2020a) Generating informative conversational response using recurrent knowledge-interaction and knowledge-copy. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 41–52, https://doi.org/10.18653/v1/2020.acl-main.6

  • Lin Y, Liu Z, Sun M, Liu Y, Zhu X (2015) Learning entity and relation embeddings for knowledge graph completion. In: Bonet B, Koenig S (eds) Proceedings of the twenty-ninth AAAI conference on artificial intelligence, January 25–30, 2015, Austin, Texas, USA, AAAI Press, pp 2181–2187, http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9571

  • Lin Z, Cai D, Wang Y, Liu X, Zheng H, Shi S (2020b) The world is not binary: Learning to rank with grayscale data for dialogue response selection. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 9220–9229, https://doi.org/10.18653/v1/2020.emnlp-main.741

  • Lin Z, Madotto A, Winata GI, Fung P (2020c) MinTL: Minimalist transfer learning for task-oriented dialogue systems. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 3391–3405, https://doi.org/10.18653/v1/2020.emnlp-main.273

  • Lipton ZC, Berkowitz J, Elkan C (2015) A critical review of recurrent neural networks for sequence learning. https://arxiv.org/abs/1506.00019

  • Lison P, Bibauw S (2017) Not all dialogues are created equal: Instance weighting for neural conversational models. In: Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, Association for Computational Linguistics, Saarbrücken, Germany, pp 384–394, https://doi.org/10.18653/v1/W17-5546

  • Liu B, Lane I (2017) Iterative policy learning in end-to-end trainable task-oriented neural dialog models. In: 2017 IEEE automatic speech recognition and understanding workshop (ASRU), IEEE, pp 482–489

  • Liu B, Lane IR (2016) Attention-based recurrent neural network models for joint intent detection and slot filling. In: Morgan N (ed) Interspeech 2016, 17th annual conference of the international speech communication association, San Francisco, CA, USA, September 8–12, 2016, ISCA, pp 685–689, https://doi.org/10.21437/Interspeech.2016-1352

  • Liu C, He S, Liu K, Zhao J (2019) Vocabulary pyramid network: Multi-pass encoding and decoding with multi-level vocabularies for response generation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3774–3783, https://doi.org/10.18653/v1/P19-1367

  • Liu CW, Lowe R, Serban I, Noseworthy M, Charlin L, Pineau J (2016) How NOT to evaluate your dialogue system: an empirical study of unsupervised evaluation metrics for dialogue response generation. In: Proceedings of the 2016 conference on empirical methods in natural language processing, association for computational linguistics, Austin, Texas, pp 2122–2132, https://doi.org/10.18653/v1/D16-1230

  • Liu H, Wang W, Wang Y, Liu H, Liu Z, Tang J (2020a) Mitigating gender bias for neural dialogue generation with adversarial learning. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 893–903, https://doi.org/10.18653/v1/2020.emnlp-main.64

  • Liu Q, Chen Y, Chen B, Lou JG, Chen Z, Zhou B, Zhang D (2020b) You impress me: dialogue generation via mutual persona perception. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 1417–1427, https://doi.org/10.18653/v1/2020.acl-main.131

  • Liu Y, Lapata M (2018) Learning structured text representations. Trans Assoc Comput Linguist 6:63–75

  • Liu Z, Wang H, Niu ZY, Wu H, Che W, Liu T (2020c) Towards conversational recommendation over multi-type dialogs. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 1036–1049, https://doi.org/10.18653/v1/2020.acl-main.98

  • Lowe R, Pow N, Serban I, Pineau J (2015) The Ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems. In: Proceedings of the 16th annual meeting of the special interest group on discourse and dialogue, association for computational linguistics, Prague, Czech Republic, pp 285–294, https://doi.org/10.18653/v1/W15-4640

  • Lowe R, Noseworthy M, Serban IV, Angelard-Gontier N, Bengio Y, Pineau J (2017) Towards an automatic Turing test: Learning to evaluate dialogue responses. In: Proceedings of the 55th annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Vancouver, Canada, pp 1116–1126, https://doi.org/10.18653/v1/P17-1103

  • Lu J, Batra D, Parikh D, Lee S (2019a) Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. In: Wallach HM, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox EB, Garnett R (eds) Advances in neural information processing systems 32: annual conference on neural information processing systems 2019, NeurIPS 2019, December 8–14, 2019, Vancouver, BC, Canada, pp 13–23, https://proceedings.neurips.cc/paper/2019/hash/c74d97b01eae257e44aa9d5bade97baf-Abstract.html

  • Lu J, Zhang C, Xie Z, Ling G, Zhou TC, Xu Z (2019b) Constructing interpretive spatio-temporal features for multi-turn responses selection. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 44–50, https://doi.org/10.18653/v1/P19-1006

  • Lu J, Goswami V, Rohrbach M, Parikh D, Lee S (2020) 12-in-1: Multi-task vision and language representation learning. In: 2020 IEEE/CVF conference on computer vision and pattern recognition, CVPR 2020, Seattle, WA, USA, June 13–19, 2020, IEEE, pp 10434–10443, https://doi.org/10.1109/CVPR42600.2020.01045

  • Lubis N, Sakti S, Yoshino K, Nakamura S (2018) Eliciting positive emotion through affect-sensitive dialogue response generation: A neural network approach. In: McIlraith SA, Weinberger KQ (eds) Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, AAAI Press, pp 5293–5300, https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16317

  • Ma MD, Bowden K, Wu J, Cui W, Walker M (2019) Implicit discourse relation identification for open-domain dialogues. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Florence, Italy, pp 666–672, https://doi.org/10.18653/v1/P19-1065

  • Ma W, Cui Y, Liu T, Wang D, Wang S, Hu G (2020a) Conversational Word Embedding for Retrieval-Based Dialog System. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 1375–1380, https://doi.org/10.18653/v1/2020.acl-main.127

  • Ma Y, Nguyen KL, Xing FZ, Cambria E (2020) A survey on empathetic dialogue systems. Inf Fusion 64:50–70

  • Madotto A, Lin Z, Wu CS, Fung P (2019) Personalizing dialogue agents via meta-learning. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5454–5459, https://doi.org/10.18653/v1/P19-1542

  • Majumder BP, Jhamtani H, Berg-Kirkpatrick T, McAuley J (2020a) Like hiking? You probably enjoy nature: Persona-grounded dialog with commonsense expansions. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 9194–9206, https://doi.org/10.18653/v1/2020.emnlp-main.739

  • Majumder BP, Li S, Ni J, McAuley J (2020b) Interview: Large-scale modeling of media dialog with discourse patterns and knowledge grounding. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 8129–8141, https://doi.org/10.18653/v1/2020.emnlp-main.653

  • Mallios S, Bourbakis N (2016) A survey on human machine dialogue systems. In: 2016 7th international conference on information, intelligence, systems & applications (IISA), IEEE, pp 1–7

  • Manuvinakurike R, Brixey J, Bui T, Chang W, Artstein R, Georgila K (2018) DialEdit: Annotations for spoken conversational image editing. In: Proceedings 14th Joint ACL - ISO Workshop on Interoperable Semantic Annotation, Association for Computational Linguistics, Santa Fe, New Mexico, USA, pp 1–9, https://aclanthology.org/W18-4701

  • Mao HH, Li S, McAuley JJ, Cottrell GW (2020) Speech recognition and multi-speaker diarization of long conversations. In: Meng H, Xu B, Zheng TF (eds) Interspeech 2020, 21st Annual conference of the international speech communication association, virtual event, Shanghai, China, 25–29 October 2020, ISCA, pp 691–695, https://doi.org/10.21437/Interspeech.2020-3039

  • Mehri S, Eskenazi M (2020) USR: An unsupervised and reference free evaluation metric for dialog generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 681–707, https://doi.org/10.18653/v1/2020.acl-main.64

  • Mehri S, Razumovskaia E, Zhao T, Eskenazi M (2019) Pretraining methods for dialog context representation learning. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3836–3845, https://doi.org/10.18653/v1/P19-1373

  • Mesgar M, Bücker S, Gurevych I (2020) Dialogue coherence assessment without explicit dialogue act labels. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 1439–1450, https://doi.org/10.18653/v1/2020.acl-main.133

  • Mesnil G, He X, Deng L, Bengio Y (2013) Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding. In: Interspeech, pp 3771–3775

  • Mesnil G, Dauphin Y, Yao K, Bengio Y, Deng L, Hakkani-Tur D, He X, Heck L, Tur G, Yu D et al (2014) Using recurrent neural networks for slot filling in spoken language understanding. IEEE/ACM Trans Audio Speech Language Process 23(3):530–539

  • Miao N, Zhou H, Mou L, Yan R, Li L (2019) CGMH: constrained sentence generation by metropolis-hastings sampling. In: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27–February 1, 2019, AAAI Press, pp 6834–6842, https://doi.org/10.1609/aaai.v33i01.33016834

  • Miech A, Alayrac J, Smaira L, Laptev I, Sivic J, Zisserman A (2020) End-to-end learning of visual representations from uncurated instructional videos. In: 2020 IEEE/CVF conference on computer vision and pattern recognition, CVPR 2020, Seattle, WA, USA, June 13–19, 2020, IEEE, pp 9876–9886, https://doi.org/10.1109/CVPR42600.2020.00990

  • Miller A, Fisch A, Dodge J, Karimi AH, Bordes A, Weston J (2016) Key-value memory networks for directly reading documents. In: Proceedings of the 2016 conference on empirical methods in natural language processing, association for computational linguistics, Austin, Texas, pp 1400–1409, https://doi.org/10.18653/v1/D16-1147

  • Miltsakaki E, Prasad R, Joshi A, Webber B (2004) The Penn Discourse Treebank. In: Proceedings of the fourth international conference on language resources and evaluation (LREC’04), European Language Resources Association (ELRA), Lisbon, Portugal, http://www.lrec-conf.org/proceedings/lrec2004/pdf/618.pdf

  • Mirowski P, Pascanu R, Viola F, Soyer H, Ballard A, Banino A, Denil M, Goroshin R, Sifre L, Kavukcuoglu K, Kumaran D, Hadsell R (2017) Learning to navigate in complex environments. In: 5th international conference on learning representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings, OpenReview.net, https://openreview.net/forum?id=SJMGPrcle

  • Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G et al (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533

  • Mnih V, Badia AP, Mirza M, Graves A, Lillicrap TP, Harley T, Silver D, Kavukcuoglu K (2016) Asynchronous methods for deep reinforcement learning. In: Balcan M, Weinberger KQ (eds) Proceedings of the 33rd international conference on machine learning, ICML 2016, New York City, NY, USA, June 19–24, 2016, JMLR.org, JMLR Workshop and Conference Proceedings, vol 48, pp 1928–1937, http://proceedings.mlr.press/v48/mniha16.html

  • Mo K, Zhang Y, Li S, Li J, Yang Q (2018) Personalizing a dialogue system with transfer reinforcement learning. In: McIlraith SA, Weinberger KQ (eds) Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, AAAI Press, pp 5317–5324, https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16104

  • Moghe N, Arora S, Banerjee S, Khapra MM (2018) Towards exploiting background knowledge for building conversation systems. In: Proceedings of the 2018 conference on empirical methods in natural language processing, association for computational linguistics, Brussels, Belgium, pp 2322–2332, https://doi.org/10.18653/v1/D18-1255

  • Mohammad S, Bravo-Marquez F, Salameh M, Kiritchenko S (2018) SemEval-2018 task 1: Affect in tweets. In: Proceedings of the 12th international workshop on semantic evaluation, association for computational linguistics, New Orleans, Louisiana, pp 1–17, https://doi.org/10.18653/v1/S18-1001

  • Moon S, Shah P, Kumar A, Subba R (2019) OpenDialKG: Explainable conversational reasoning with attention-based walks over knowledge graphs. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 845–854, https://doi.org/10.18653/v1/P19-1081

  • Mostafazadeh N, Brockett C, Dolan B, Galley M, Gao J, Spithourakis G, Vanderwende L (2017) Image-grounded conversations: multimodal context for natural question and response generation. In: Proceedings of the eighth international joint conference on natural language processing (volume 1: long papers), Asian Federation of Natural Language Processing, Taipei, Taiwan, pp 462–472, https://aclanthology.org/I17-1047

  • Mrkšić N, Ó Séaghdha D, Thomson B, Gašić M, Su PH, Vandyke D, Wen TH, Young S (2015) Multi-domain dialog state tracking using recurrent neural networks. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (volume 2: short papers), association for computational linguistics, Beijing, China, pp 794–799, https://doi.org/10.3115/v1/P15-2130

  • Mrkšić N, Ó Séaghdha D, Wen TH, Thomson B, Young S (2017) Neural belief tracker: data-driven dialogue state tracking. In: Proceedings of the 55th annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Vancouver, Canada, pp 1777–1788, https://doi.org/10.18653/v1/P17-1163

  • Nakov P, Màrquez L, Magdy W, Moschitti A, Glass J, Randeree B (2015) SemEval-2015 task 3: Answer selection in community question answering. In: Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015), association for computational linguistics, Denver, Colorado, pp 269–281, https://doi.org/10.18653/v1/S15-2047

  • Ni J, Pandelea V, Young T, Zhou H, Cambria E (2022) Hitkg: Towards goal-oriented conversations via multi-hierarchy learning. Proceedings of the AAAI conference on artificial intelligence 36:11112–11120

  • Nickel M, Rosasco L, Poggio TA (2016) Holographic embeddings of knowledge graphs. In: Schuurmans D, Wellman MP (eds) Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, February 12–17, 2016, Phoenix, Arizona, USA, AAAI Press, pp 1955–1961, http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/12484

  • Novikova J, Dušek O, Rieser V (2017) The E2E dataset: new challenges for end-to-end generation. In: Proceedings of the 18th annual SIGdial meeting on discourse and dialogue, association for computational linguistics, Saarbrücken, Germany, pp 201–206, https://doi.org/10.18653/v1/W17-5525

  • Obuchowski A, Lew M (2020) Transformer-capsule model for intent detection (student abstract). In: The thirty-fourth AAAI conference on artificial intelligence, AAAI 2020, the thirty-second innovative applications of artificial intelligence conference, IAAI 2020, the tenth AAAI symposium on educational advances in artificial intelligence, EAAI 2020, New York, NY, USA, February 7–12, 2020, AAAI Press, pp 13885–13886, https://aaai.org/ojs/index.php/AAAI/article/view/7215

  • Oraby S, Harrison V, Ebrahimi A, Walker M (2019) Curate and generate: a corpus and method for joint control of semantics and style in neural NLG. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5938–5951, https://doi.org/10.18653/v1/P19-1596

  • Ouyang Y, Chen M, Dai X, Zhao Y, Huang S, Chen J (2020) Dialogue state tracking with explicit slot connection modeling. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 34–40, https://doi.org/10.18653/v1/2020.acl-main.5

  • Panayotov V, Chen G, Povey D, Khudanpur S (2015) Librispeech: An ASR corpus based on public domain audio books. In: 2015 IEEE international conference on acoustics, speech and signal processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19–24, 2015, IEEE, pp 5206–5210, https://doi.org/10.1109/ICASSP.2015.7178964

  • Pang B, Nijkamp E, Han W, Zhou L, Liu Y, Tu K (2020) Towards holistic and automatic evaluation of open-domain dialogue generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 3619–3629, https://doi.org/10.18653/v1/2020.acl-main.333

  • Papineni K, Roukos S, Ward T, Zhu WJ (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the association for computational linguistics, association for computational linguistics, Philadelphia, Pennsylvania, USA, pp 311–318, https://doi.org/10.3115/1073083.1073135

  • Parikh A, Täckström O, Das D, Uszkoreit J (2016) A decomposable attention model for natural language inference. In: Proceedings of the 2016 conference on empirical methods in natural language processing, association for computational linguistics, Austin, Texas, pp 2249–2255, https://doi.org/10.18653/v1/D16-1244

  • Pascanu R, Mikolov T, Bengio Y (2013) On the difficulty of training recurrent neural networks. In: Proceedings of the 30th international conference on machine learning, ICML 2013, Atlanta, GA, USA, 16–21 June 2013, JMLR.org, JMLR workshop and conference proceedings, vol 28, pp 1310–1318, http://proceedings.mlr.press/v28/pascanu13.html

  • Peng B, Li X, Li L, Gao J, Celikyilmaz A, Lee S, Wong KF (2017) Composite task-completion dialogue policy learning via hierarchical deep reinforcement learning. In: Proceedings of the 2017 conference on empirical methods in natural language processing, association for computational linguistics, Copenhagen, Denmark, pp 2231–2240, https://doi.org/10.18653/v1/D17-1237

  • Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long papers), association for computational linguistics, New Orleans, Louisiana, pp 2227–2237, https://doi.org/10.18653/v1/N18-1202

  • Pfau D, Vinyals O (2016) Connecting generative adversarial networks and actor-critic methods

  • Poria S, Hazarika D, Majumder N, Naik G, Cambria E, Mihalcea R (2019) MELD: A multimodal multi-party dataset for emotion recognition in conversations. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 527–536, https://doi.org/10.18653/v1/P19-1050

  • Powers DMW (2020) Evaluation: from precision, recall and f-measure to roc, informedness, markedness and correlation

  • Puterman ML (2014) Markov decision processes: discrete stochastic dynamic programming. Wiley, Hoboken

  • Qi D, Su L, Song J, Cui E, Bharti T, Sacheti A (2020) Imagebert: cross-modal pre-training with large-scale weak-supervised image-text data. https://arxiv.org/abs/2001.07966

  • Qian K, Yu Z (2019) Domain adaptive dialog generation via meta learning. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 2639–2649, https://doi.org/10.18653/v1/P19-1253

  • Qin L, Che W, Li Y, Wen H, Liu T (2019) A stack-propagation framework with token-level intent detection for spoken language understanding. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), association for computational linguistics, Hong Kong, China, pp 2078–2087, https://doi.org/10.18653/v1/D19-1214

  • Qin L, Xu X, Che W, Zhang Y, Liu T (2020) Dynamic fusion network for multi-domain end-to-end task-oriented dialog. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 6344–6354, https://doi.org/10.18653/v1/2020.acl-main.565

  • Qiu L, Li J, Bi W, Zhao D, Yan R (2019) Are training samples correlated? Learning to generate dialogue responses with multiple references. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3826–3835, https://doi.org/10.18653/v1/P19-1372

  • Qiu L, Zhao Y, Shi W, Liang Y, Shi F, Yuan T, Yu Z, Zhu SC (2020) Structured attention for unsupervised dialogue structure induction. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 1889–1899, https://doi.org/10.18653/v1/2020.emnlp-main.148

  • Qiu M, Li FL, Wang S, Gao X, Chen Y, Zhao W, Chen H, Huang J, Chu W (2017) AliMe chat: A sequence to sequence and rerank based chatbot engine. In: Proceedings of the 55th annual meeting of the association for computational linguistics (volume 2: short papers), association for computational linguistics, Vancouver, Canada, pp 498–503, https://doi.org/10.18653/v1/P17-2079

  • Quan J, Xiong D (2020) Modeling long context for task-oriented dialogue state generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 7119–7124, https://doi.org/10.18653/v1/2020.acl-main.637

  • Quan J, Zhang S, Cao Q, Li Z, Xiong D (2020) RiSAWOZ: A large-scale multi-domain Wizard-of-Oz dataset with rich semantic annotations for task-oriented dialogue modeling. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 930–940, https://doi.org/10.18653/v1/2020.emnlp-main.67

  • Rajpurkar P, Jia R, Liang P (2018) Know what you don’t know: Unanswerable questions for SQuAD. In: Proceedings of the 56th annual meeting of the association for computational linguistics (volume 2: short papers), association for computational linguistics, Melbourne, Australia, pp 784–789, https://doi.org/10.18653/v1/P18-2124

  • Ram A, Prasad R, Khatri C, Venkatesh A, Gabriel R, Liu Q, Nunn J, Hedayatnia B, Cheng M, Nagar A, et al. (2018) Conversational ai: the science behind the alexa prize. https://arxiv.org/abs/1801.03604

  • Rameshkumar R, Bailey P (2020) Storytelling with dialogue: A Critical Role Dungeons and Dragons Dataset. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 5121–5134, https://doi.org/10.18653/v1/2020.acl-main.459

  • Ramshaw L, Marcus M (1995) Text chunking using transformation-based learning. In: Third workshop on very large corpora, https://aclanthology.org/W95-0107

  • Rashkin H, Smith EM, Li M, Boureau YL (2019) Towards empathetic open-domain conversation models: A new benchmark and dataset. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5370–5381, https://doi.org/10.18653/v1/P19-1534

  • Rastogi A, Zang X, Sunkara S, Gupta R, Khaitan P (2020) Towards scalable multi-domain conversational agents: the schema-guided dialogue dataset. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7–12, 2020, AAAI Press, pp 8689–8696, https://aaai.org/ojs/index.php/AAAI/article/view/6394

  • Ravuri S, Stolcke A (2015) Recurrent neural network and lstm models for lexical utterance classification. In: Sixteenth annual conference of the international speech communication association

  • Ravuri SV, Stolcke A (2016) A comparative study of recurrent neural network models for lexical domain classification. In: 2016 IEEE international conference on acoustics, speech and signal processing, ICASSP 2016, Shanghai, China, March 20–25, 2016, IEEE, pp 6075–6079, https://doi.org/10.1109/ICASSP.2016.7472844

  • Rawat W, Wang Z (2017) Deep convolutional neural networks for image classification: a comprehensive review. Neural Comput 29(9):2352–2449

  • Reiter E (1994) Has a consensus NL generation architecture appeared, and is it psycholinguistically plausible? In: Proceedings of the Seventh International Workshop on Natural Language Generation, https://aclanthology.org/W94-0319

  • Ren H, Xu W, Zhang Y, Yan Y (2013) Dialog state tracking using conditional random fields. In: Proceedings of the SIGDIAL 2013 conference, association for computational linguistics, Metz, France, pp 457–461, https://aclanthology.org/W13-4071

  • Ren L, Xie K, Chen L, Yu K (2018) Towards universal dialogue state tracking. In: Proceedings of the 2018 conference on empirical methods in natural language processing, association for computational linguistics, Brussels, Belgium, pp 2780–2786, https://doi.org/10.18653/v1/D18-1299

  • Ren P, Chen Z, Ren Z, Kanoulas E, Monz C, de Rijke M (2020) Conversations with search engines. https://arxiv.org/abs/2004.14162

  • Ritter A, Cherry C, Dolan WB (2011) Data-driven response generation in social media. In: Proceedings of the 2011 conference on empirical methods in natural language processing, association for computational linguistics, Edinburgh, Scotland, UK. pp 583–593, https://aclanthology.org/D11-1054

  • Rodriguez P, Crook P, Moon S, Wang Z (2020) Information seeking in the spirit of learning: a dataset for conversational curiosity. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 8153–8172, https://doi.org/10.18653/v1/2020.emnlp-main.655

  • Saha A, Khapra MM, Sankaranarayanan K (2018) Towards building large scale multimodal domain-aware conversation systems. In: McIlraith SA, Weinberger KQ (eds) Proceedings of the thirty-second AAAI conference on artificial intelligence, (AAAI-18), the 30th innovative applications of artificial intelligence (IAAI-18), and the 8th AAAI symposium on educational advances in artificial intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, AAAI Press, pp 696–704, https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17104

  • Saha T, Patra A, Saha S, Bhattacharyya P (2020) Towards emotion-aided multi-modal dialogue act classification. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 4361–4372, https://doi.org/10.18653/v1/2020.acl-main.402

  • Sankar C, Subramanian S, Pal C, Chandar S, Bengio Y (2019) Do neural dialog systems use the conversation history effectively? An empirical study. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 32–37, https://doi.org/10.18653/v1/P19-1004

  • Santhanam S, Shaikh S (2019) A survey of natural language generation techniques with a focus on dialogue systems-past, present and future directions. https://arxiv.org/abs/1906.00500

  • Sarikaya R, Hinton GE, Ramabhadran B (2011) Deep belief nets for natural language call-routing. In: 2011 IEEE International conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 5680–5683

  • Sarikaya R, Hinton GE, Deoras A (2014) Application of deep belief networks for natural language understanding. IEEE/ACM Trans Audio Speech Lang Process 22(4):778–784

  • Sato S, Akama R, Ouchi H, Suzuki J, Inui K (2020) Evaluating dialogue generation systems via response selection. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 593–599, https://doi.org/10.18653/v1/2020.acl-main.55

  • Schatzmann J, Young S (2009) The hidden agenda user simulation model. IEEE/ACM Trans Audio Speech Lang Process 17(4):733–747

  • Schuster M, Paliwal KK (1997) Bidirectional recurrent neural networks. IEEE Trans Signal Process 45(11):2673–2681

  • See A, Roller S, Kiela D, Weston J (2019) What makes a good conversation? how controllable attributes affect human judgments. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers), association for computational linguistics, Minneapolis, Minnesota, pp 1702–1723, https://doi.org/10.18653/v1/N19-1170

  • Serban IV, Sordoni A, Bengio Y, Courville AC, Pineau J (2016) Building end-to-end dialogue systems using generative hierarchical neural network models. In: Schuurmans D, Wellman MP (eds) Proceedings of the thirtieth AAAI conference on artificial intelligence, February 12–17, 2016, Phoenix, Arizona, USA, AAAI Press, pp 3776–3784, http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/11957

  • Serban IV, Sankar C, Germain M, Zhang S, Lin Z, Subramanian S, Kim T, Pieper M, Chandar S, Ke NR, et al. (2017a) A deep reinforcement learning chatbot. https://arxiv.org/abs/1709.02349

  • Serban IV, Sordoni A, Lowe R, Charlin L, Pineau J, Courville AC, Bengio Y (2017b) A hierarchical latent variable encoder-decoder model for generating dialogues. In: Singh SP, Markovitch S (eds) Proceedings of the thirty-first AAAI conference on artificial intelligence, February 4–9, 2017, San Francisco, California, USA, AAAI Press, pp 3295–3301, http://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14567

  • Serras M, Torres MI, del Pozo A (2019) Goal-conditioned user modeling for dialogue systems using stochastic bi-automata. In: ICPRAM, pp 128–134

  • Shah P, Hakkani-Tür D, Tür G, Rastogi A, Bapna A, Nayak N, Heck L (2018) Building a conversational agent overnight with dialogue self-play. https://arxiv.org/abs/1801.04871

  • Shan Y, Li Z, Zhang J, Meng F, Feng Y, Niu C, Zhou J (2020) A contextual hierarchical attention network with adaptive objective for dialogue state tracking. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 6322–6333, https://doi.org/10.18653/v1/2020.acl-main.563

  • Shang L, Lu Z, Li H (2015) Neural responding machine for short-text conversation. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (volume 1: long papers), association for computational linguistics, Beijing, China, pp 1577–1586, https://doi.org/10.3115/v1/P15-1152

  • Shao L, Gouws S, Britz D, Goldie A, Strope B, Kurzweil R (2017) Generating long and diverse responses with neural conversation models. https://arxiv.org/abs/1701.03185

  • Shao Y, Nakashole N (2020) ChartDialogs: Plotting from Natural Language Instructions. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 3559–3574, https://doi.org/10.18653/v1/2020.acl-main.328

  • Shen L, Feng Y, Zhan H (2019) Modeling semantic relationship in multi-turn conversations with hierarchical latent variables. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5497–5502, https://doi.org/10.18653/v1/P19-1549

  • Shi B, Weninger T (2017) ProjE: Embedding projection for knowledge graph completion. In: Singh SP, Markovitch S (eds) Proceedings of the thirty-first AAAI conference on artificial intelligence, February 4–9, 2017, San Francisco, California, USA, AAAI Press, pp 1236–1242, http://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14279

  • Shuster K, Humeau S, Bordes A, Weston J (2020a) Image-chat: Engaging grounded conversations. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 2414–2429, https://doi.org/10.18653/v1/2020.acl-main.219

  • Shuster K, Humeau S, Bordes A, Weston J (2020b) Image-chat: engaging grounded conversations. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 2414–2429, https://doi.org/10.18653/v1/2020.acl-main.219

  • Shuster K, Ju D, Roller S, Dinan E, Boureau YL, Weston J (2020c) The dialogue dodecathlon: Open-domain knowledge and image grounded conversational agents. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 2453–2470, https://doi.org/10.18653/v1/2020.acl-main.222

  • Siddharthan A (2001) Ehud Reiter and Robert Dale. Building natural language generation systems. Natural Lang Eng 7(3):271

  • Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M et al (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529(7587):484–489

  • Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: Bengio Y, LeCun Y (eds) 3rd international conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, Conference Track Proceedings, http://arxiv.org/abs/1409.1556

  • Singh A, Goswami V, Parikh D (2020) Are we pretraining it right? Digging deeper into visio-linguistic pretraining. https://arxiv.org/abs/2004.08744

  • Singh S, Litman D, Kearns M, Walker M (2002) Optimizing dialogue management with reinforcement learning: experiments with the NJFun system. J Artif Intell Res 16:105–133

  • Singla K, Chen Z, Atkins D, Narayanan S (2020) Towards end-2-end learning for predicting behavior codes from spoken utterances in psychotherapy conversations. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 3797–3803, https://doi.org/10.18653/v1/2020.acl-main.351

  • Sinha K, Parthasarathi P, Wang J, Lowe R, Hamilton WL, Pineau J (2020) Learning an unreferenced metric for online dialogue evaluation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 2430–2441, https://doi.org/10.18653/v1/2020.acl-main.220

  • Smith EM, Williamson M, Shuster K, Weston J, Boureau YL (2020) Can you put it all together: Evaluating conversational agents’ ability to blend skills. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 2021–2030, https://doi.org/10.18653/v1/2020.acl-main.183

  • Socher R, Chen D, Manning CD, Ng AY (2013) Reasoning with neural tensor networks for knowledge base completion. In: Burges CJC, Bottou L, Ghahramani Z, Weinberger KQ (eds) Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5–8, 2013, Lake Tahoe, Nevada, United States, pp 926–934, https://proceedings.neurips.cc/paper/2013/hash/b337e84de8752b27eda3a12363109e80-Abstract.html

  • Song H, Wang Y, Zhang WN, Liu X, Liu T (2020a) Generate, delete and rewrite: A three-stage framework for improving persona consistency of dialogue generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 5821–5831, https://doi.org/10.18653/v1/2020.acl-main.516

  • Song H, Wang Y, Zhang WN, Zhao Z, Liu T, Liu X (2020b) Profile consistency identification for open-domain dialogue agents. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 6651–6662, https://doi.org/10.18653/v1/2020.emnlp-main.539

  • Song Y, Yan R, Li X, Zhao D, Zhang M (2016) Two are better than one: an ensemble of retrieval-and generation-based dialog systems

  • Song Z, Zheng X, Liu L, Xu M, Huang X (2019) Generating responses with a specific emotion in dialog. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3685–3695, https://doi.org/10.18653/v1/P19-1359

  • Sordoni A, Bengio Y, Vahabi H, Lioma C, Simonsen JG, Nie J (2015a) A hierarchical recurrent encoder-decoder for generative context-aware query suggestion. In: Bailey J, Moffat A, Aggarwal CC, de Rijke M, Kumar R, Murdock V, Sellis TK, Yu JX (eds) Proceedings of the 24th ACM international conference on information and knowledge management, CIKM 2015, Melbourne, VIC, Australia, October 19–23, 2015, ACM, pp 553–562, https://doi.org/10.1145/2806416.2806493

  • Sordoni A, Galley M, Auli M, Brockett C, Ji Y, Mitchell M, Nie JY, Gao J, Dolan B (2015b) A neural network approach to context-sensitive generation of conversational responses. In: Proceedings of the 2015 conference of the North American chapter of the association for computational linguistics: human language technologies, association for computational linguistics, Denver, Colorado, pp 196–205, https://doi.org/10.3115/v1/N15-1020

  • Stasaski K, Yang GH, Hearst MA (2020) More diverse dialogue datasets via diversity-informed data collection. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 4958–4968, https://doi.org/10.18653/v1/2020.acl-main.446

  • Stent A, Marge M, Singhai M (2005) Evaluating evaluation methods for generation in the presence of variation. In: International conference on intelligent text processing and computational linguistics, Springer, pp 341–351

  • Su H, Shen X, Zhang R, Sun F, Hu P, Niu C, Zhou J (2019a) Improving multi-turn dialogue modelling with utterance ReWriter. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 22–31, https://doi.org/10.18653/v1/P19-1003

  • Su H, Shen X, Zhao S, Xiao Z, Hu P, Zhong R, Niu C, Zhou J (2020a) Diversifying dialogue generation with non-conversational text. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 7087–7097, https://doi.org/10.18653/v1/2020.acl-main.634

  • Su PH, Vandyke D, Gasic M, Kim D, Mrksic N, Wen TH, Young S (2015) Learning from real users: Rating dialogue success with neural networks for reinforcement learning in spoken dialogue systems. https://arxiv.org/abs/1508.03386

  • Su PH, Gasic M, Mrksic N, Rojas-Barahona L, Ultes S, Vandyke D, Wen TH, Young S (2016) Continuously learning neural dialogue management. https://arxiv.org/abs/1606.02689

  • Su SY, Huang CW, Chen YN (2019b) Dual supervised learning for natural language understanding and generation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Florence, Italy, pp 5472–5477, https://doi.org/10.18653/v1/P19-1545

  • Su W, Zhu X, Cao Y, Li B, Lu L, Wei F, Dai J (2020b) VL-BERT: pre-training of generic visual-linguistic representations. In: 8th international conference on learning representations, ICLR 2020, Addis Ababa, Ethiopia, April 26–30, 2020, OpenReview.net, https://openreview.net/forum?id=SygXPaEYvH

  • Sukhbaatar S, Szlam A, Weston J, Fergus R (2015) End-to-end memory networks. In: Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R (eds) Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7–12, 2015, Montreal, Quebec, Canada, pp 2440–2448, https://proceedings.neurips.cc/paper/2015/hash/8fb21ee7a2207526da55a679f0332de2-Abstract.html

  • Sun C, Baradel F, Murphy K, Schmid C (2019a) Learning video representations using contrastive bidirectional transformer. https://arxiv.org/abs/1906.05743

  • Sun C, Myers A, Vondrick C, Murphy K, Schmid C (2019b) VideoBERT: a joint model for video and language representation learning. In: 2019 IEEE/CVF international conference on computer vision, ICCV 2019, Seoul, Korea (South), October 27–November 2, 2019, IEEE, pp 7463–7472, https://doi.org/10.1109/ICCV.2019.00756

  • Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ (eds) Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8–13 2014, Montreal, Quebec, Canada, pp 3104–3112, https://proceedings.neurips.cc/paper/2014/hash/a14ac55a4f27472c5d894ec1c3c743d2-Abstract.html

  • Sutton RS (1988) Learning to predict by the methods of temporal differences. Mach Learn 3(1):9–44

  • Sutton RS, McAllester DA, Singh SP, Mansour Y et al (1999) Policy gradient methods for reinforcement learning with function approximation. NIPS, Citeseer 99:1057–1063

  • Szegedy C, Liu W, Jia Y, Sermanet P, Reed SE, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: IEEE conference on computer vision and pattern recognition, CVPR 2015, Boston, MA, USA, June 7–12, 2015, IEEE Computer Society, pp 1–9, https://doi.org/10.1109/CVPR.2015.7298594

  • Takanobu R, Liang R, Huang M (2020) Multi-agent task-oriented dialog policy learning with role-aware reward decomposition. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 625–638, https://doi.org/10.18653/v1/2020.acl-main.59

  • Takmaz E, Giulianelli M, Pezzelle S, Sinclair A, Fernández R (2020) Refer, reuse, reduce: generating subsequent references in visual and conversational contexts. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 4350–4368, https://doi.org/10.18653/v1/2020.emnlp-main.353

  • Tamar A, Levine S, Abbeel P, Wu Y, Thomas G (2016) Value iteration networks. In: Lee DD, Sugiyama M, von Luxburg U, Guyon I, Garnett R (eds) Advances in neural information processing systems 29: annual conference on neural information processing systems 2016, December 5–10, 2016, Barcelona, Spain, pp 2146–2154, https://proceedings.neurips.cc/paper/2016/hash/c21002f464c5fc5bee3b98ced83963b8-Abstract.html

  • Tan H, Bansal M (2019) LXMERT: Learning cross-modality encoder representations from transformers. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong, China, pp 5100–5111, https://doi.org/10.18653/v1/D19-1514

  • Tanana M, Hallgren KA, Imel ZE, Atkins DC, Srikumar V (2016) A comparison of natural language processing methods for automated coding of motivational interviewing. J Subst Abuse Treatment 65:43–50

  • Tang D, Qin B, Liu T (2015) Learning semantic representations of users and products for document level sentiment classification. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (volume 1: long papers), association for computational linguistics, Beijing, China, pp 1014–1023, https://doi.org/10.3115/v1/P15-1098

  • Tang J, Zhao T, Xiong C, Liang X, Xing E, Hu Z (2019) Target-guided open-domain conversation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5624–5634, https://doi.org/10.18653/v1/P19-1565

  • Tao C, Mou L, Zhao D, Yan R (2018) RUBER: an unsupervised method for automatic evaluation of open-domain dialog systems. In: McIlraith SA, Weinberger KQ (eds) Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, AAAI Press, pp 722–729, https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16179

  • Tao C, Wu W, Xu C, Hu W, Zhao D, Yan R (2019) One time of interaction may not be enough: Go deep with an interaction-over-interaction network for response selection in dialogues. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 1–11, https://doi.org/10.18653/v1/P19-1001

  • Tay Y, Wang S, Luu AT, Fu J, Phan MC, Yuan X, Rao J, Hui SC, Zhang A (2019) Simple and effective curriculum pointer-generator networks for reading comprehension over long narratives. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 4922–4931, https://doi.org/10.18653/v1/P19-1486

  • Tay Y, Dehghani M, Bahri D, Metzler D (2020) Efficient transformers: a survey. https://arxiv.org/abs/2009.06732

  • Theune M (2003) Natural language generation for dialogue: system survey. University of Twente, Centre for Telematics and Information Technology

  • Thomas M, Pang B, Lee L (2006) Get out the vote: Determining support or opposition from congressional floor-debate transcripts. In: Proceedings of the 2006 conference on empirical methods in natural language processing, association for computational linguistics, Sydney, Australia, pp 327–335, https://aclanthology.org/W06-1639

  • Tian Z, Bi W, Li X, Zhang NL (2019) Learning to abstract for memory-augmented conversational response generation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3816–3825, https://doi.org/10.18653/v1/P19-1371

  • Tiedemann J (2012) Parallel data, tools and interfaces in OPUS. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12), European Language Resources Association (ELRA), Istanbul, Turkey, pp 2214–2218, http://www.lrec-conf.org/proceedings/lrec2012/pdf/463_Paper.pdf

  • Tonelli S, Riccardi G, Prasad R, Joshi A (2010) Annotation of discourse relations for conversational spoken dialogs. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10), European Language Resources Association (ELRA), Valletta, Malta, http://www.lrec-conf.org/proceedings/lrec2010/pdf/184_Paper.pdf

  • Tran VK, Nguyen LM (2017) Semantic refinement GRU-based neural language generation for spoken dialogue systems. In: International Conference of the Pacific Association for Computational Linguistics, Springer, pp 63–75

  • Tu G, Wen J, Liu C, Jiang D, Cambria E (2022) Context-and sentiment-aware networks for emotion recognition in conversation. IEEE Trans Artif Intell

  • Tur G, Hakkani-Tür D, Heck L (2010) What is left to be understood in ATIS? In: 2010 IEEE spoken language technology workshop, IEEE, pp 19–24

  • Tur G, Deng L, Hakkani-Tür D, He X (2012) Towards deeper understanding: deep convex networks for semantic utterance classification. In: 2012 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 5045–5048

  • Ultes S, Rojas-Barahona LM, Su PH, Vandyke D, Kim D, Casanueva I, Budzianowski P, Mrkšić N, Wen TH, Gašić M, Young S (2017) PyDial: A multi-domain statistical dialogue system toolkit. In: Proceedings of ACL 2017, system demonstrations, association for computational linguistics, Vancouver, Canada, pp 73–78, https://aclanthology.org/P17-4013

  • Urbanek J, Fan A, Karamcheti S, Jain S, Humeau S, Dinan E, Rocktäschel T, Kiela D, Szlam A, Weston J (2019) Learning to speak and act in a fantasy text adventure game. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), association for computational linguistics, Hong Kong, China, pp 673–683, https://doi.org/10.18653/v1/D19-1062

  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Guyon I, von Luxburg U, Bengio S, Wallach HM, Fergus R, Vishwanathan SVN, Garnett R (eds) Advances in neural information processing systems 30: annual conference on neural information processing systems 2017, December 4–9, 2017, Long Beach, CA, USA, pp 5998–6008, https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html

  • Vijayakumar AK, Cogswell M, Selvaraju RR, Sun Q, Lee S, Crandall D, Batra D (2016) Diverse beam search: decoding diverse solutions from neural sequence models

  • Vinyals O, Le Q (2015) A neural conversational model. https://arxiv.org/abs/1506.05869

  • Vinyals O, Fortunato M, Jaitly N (2015) Pointer networks. In: Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R (eds) Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pp 2692–2700, https://proceedings.neurips.cc/paper/2015/hash/29921001f2f04bd3baee84a12e98098f-Abstract.html

  • Viterbi A (1967) Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans Inf Theory 13(2):260–269

  • Vougiouklis P, Hare J, Simperl E (2016) A neural network approach for knowledge-driven response generation. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers, The COLING 2016 Organizing Committee, Osaka, Japan, pp 3370–3380, https://aclanthology.org/C16-1318

  • de Vries H, Strub F, Chandar S, Pietquin O, Larochelle H, Courville AC (2017) GuessWhat?! Visual object discovery through multi-modal dialogue. In: 2017 IEEE conference on computer vision and pattern recognition, CVPR 2017, Honolulu, HI, USA, July 21–26, 2017, IEEE Computer Society, pp 4466–4475, https://doi.org/10.1109/CVPR.2017.475

  • Walker MA, Litman DJ, Kamm CA, Abella A (1997) PARADISE: A framework for evaluating spoken dialogue agents. In: 35th annual meeting of the association for computational linguistics and 8th conference of the European Chapter of the Association for Computational Linguistics, Association for Computational Linguistics, Madrid, Spain, pp 271–280, https://doi.org/10.3115/976909.979652

  • Wan M, McAuley J (2016) Modeling ambiguity, subjectivity, and diverging viewpoints in opinion question answering systems. In: 2016 IEEE 16th international conference on data mining (ICDM), IEEE, pp 489–498

  • Wang H, Peng B, Wong KF (2020a) Learning efficient dialogue policy from demonstrations through shaping. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 6355–6365, https://doi.org/10.18653/v1/2020.acl-main.566

  • Wang K, Tian J, Wang R, Quan X, Yu J (2020b) Multi-domain dialogue acts and response co-generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 7125–7134, https://doi.org/10.18653/v1/2020.acl-main.638

  • Wang L, Li J, Zeng X, Zhang H, Wong KF (2020c) Continuity of topic, interaction, and query: Learning to quote in online conversations. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 6640–6650, https://doi.org/10.18653/v1/2020.emnlp-main.538

  • Wang S, Zhou K, Lai K, Shen J (2020d) Task-completion dialogue policy learning via Monte Carlo tree search with dueling network. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 3461–3471, https://doi.org/10.18653/v1/2020.emnlp-main.278

  • Wang W, Zhang J, Li Q, Hwang MY, Zong C, Li Z (2019a) Incremental learning from scratch for task-oriented dialogue systems. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3710–3720, https://doi.org/10.18653/v1/P19-1361

  • Wang X, Yuan C (2016) Recent advances on human-computer dialogue. CAAI Trans Intell Technol 1(4):303–312

  • Wang X, Shi W, Kim R, Oh Y, Yang S, Zhang J, Yu Z (2019b) Persuasion for good: Towards a personalized persuasive dialogue system for social good. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5635–5649, https://doi.org/10.18653/v1/P19-1566

  • Wang Y, Shen Y, Jin H (2018) A bi-model based RNN semantic frame parsing model for intent detection and slot filling. In: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 2 (short papers), association for computational linguistics, New Orleans, Louisiana, pp 309–314, https://doi.org/10.18653/v1/N18-2050

  • Wang Y, Guo Y, Zhu S (2020e) Slot attention with value normalization for multi-domain dialogue state tracking. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 3019–3028, https://doi.org/10.18653/v1/2020.emnlp-main.243

  • Wang Y, Joty S, Lyu M, King I, Xiong C, Hoi SC (2020f) VD-BERT: A Unified Vision and Dialog Transformer with BERT. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online, pp 3325–3338, https://doi.org/10.18653/v1/2020.emnlp-main.269

  • Wang Z, Zhang J, Feng J, Chen Z (2014) Knowledge graph embedding by translating on hyperplanes. In: Brodley CE, Stone P (eds) Proceedings of the twenty-eighth AAAI conference on artificial intelligence, July 27–31, 2014, Québec City, Québec, Canada, AAAI Press, pp 1112–1119, http://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/view/8531

  • Wang Z, Schaul T, Hessel M, van Hasselt H, Lanctot M, de Freitas N (2016) Dueling network architectures for deep reinforcement learning. In: Balcan M, Weinberger KQ (eds) Proceedings of the 33rd international conference on machine learning, ICML 2016, New York City, NY, USA, June 19–24, 2016, JMLR.org, JMLR Workshop and Conference Proceedings, vol 48, pp 1995–2003, http://proceedings.mlr.press/v48/wangf16.html

  • Wang Z, Ho S, Cambria E (2020) A review of emotion sensing: Categorization models and algorithms. Multimedia Tools Appl 79:35553–35582

  • Welleck S, Weston J, Szlam A, Cho K (2019) Dialogue natural language inference. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Florence, Italy, pp 3731–3741, https://doi.org/10.18653/v1/P19-1363

  • Wen TH, Gašić M, Kim D, Mrkšić N, Su PH, Vandyke D, Young S (2015a) Stochastic language generation in dialogue using recurrent neural networks with convolutional sentence reranking. In: Proceedings of the 16th annual meeting of the special interest group on discourse and dialogue, association for computational linguistics, Prague, Czech Republic, pp 275–284, https://doi.org/10.18653/v1/W15-4639

  • Wen TH, Gašić M, Mrkšić N, Su PH, Vandyke D, Young S (2015b) Semantically conditioned LSTM-based natural language generation for spoken dialogue systems. In: Proceedings of the 2015 conference on empirical methods in natural language processing, association for computational linguistics, Lisbon, Portugal, pp 1711–1721, https://doi.org/10.18653/v1/D15-1199

  • Wen TH, Gašić M, Mrkšić N, Rojas-Barahona LM, Su PH, Ultes S, Vandyke D, Young S (2016a) Conditional generation and snapshot learning in neural dialogue systems. In: Proceedings of the 2016 conference on empirical methods in natural language processing, association for computational linguistics, Austin, Texas, pp 2153–2162, https://doi.org/10.18653/v1/D16-1233

  • Wen TH, Gašić M, Mrkšić N, Rojas-Barahona LM, Su PH, Vandyke D, Young S (2016b) Multi-domain neural network language generation for spoken dialogue systems. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies, association for computational linguistics, San Diego, California, pp 120–129, https://doi.org/10.18653/v1/N16-1015

  • Wen TH, Vandyke D, Mrkšić N, Gašić M, Rojas-Barahona LM, Su PH, Ultes S, Young S (2017) A network-based end-to-end trainable task-oriented dialogue system. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics: volume 1, long papers, association for computational linguistics, Valencia, Spain, pp 438–449, https://aclanthology.org/E17-1042

  • Weston J, Chopra S, Bordes A (2015) Memory networks. In: Bengio Y, LeCun Y (eds) 3rd international conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, conference track proceedings

  • Williams J (2013) Multi-domain learning and generalization in dialog state tracking. In: Proceedings of the SIGDIAL 2013 conference, association for computational linguistics, Metz, France, pp 433–441, https://aclanthology.org/W13-4068

  • Williams J, Raux A, Ramachandran D, Black A (2013) The dialog state tracking challenge. In: Proceedings of the SIGDIAL 2013 conference, association for computational linguistics, Metz, France, pp 404–413, https://aclanthology.org/W13-4065

  • Williams JD (2007) Partially observable Markov decision processes for spoken dialogue management. PhD thesis, University of Cambridge

  • Williams JD (2014) Web-style ranking and SLU combination for dialog state tracking. In: Proceedings of the 15th annual meeting of the special interest group on discourse and dialogue (SIGDIAL), association for computational linguistics, Philadelphia, PA, U.S.A., pp 282–291, https://doi.org/10.3115/v1/W14-4339

  • Williams JD, Zweig G (2016) End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning. https://arxiv.org/abs/1606.01269

  • Williams JD, Asadi K, Zweig G (2017) Hybrid code networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning. In: Proceedings of the 55th annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Vancouver, Canada, pp 665–677, https://doi.org/10.18653/v1/P17-1062

  • Williams RJ (1992) Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach Learn 8(3–4):229–256

  • Williams RJ, Zipser D (1989) A learning algorithm for continually running fully recurrent neural networks. Neural Comput 1(2):270–280

  • Wiseman S, Shieber S, Rush A (2017) Challenges in data-to-document generation. In: Proceedings of the 2017 conference on empirical methods in natural language processing, association for computational linguistics, Copenhagen, Denmark, pp 2253–2263, https://doi.org/10.18653/v1/D17-1239

  • Wu CS, Xiong C (2020) Probing task-oriented dialogue representation from language models. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 5036–5051, https://doi.org/10.18653/v1/2020.emnlp-main.409

  • Wu CS, Madotto A, Hosseini-Asl E, Xiong C, Socher R, Fung P (2019a) Transferable multi-domain state generator for task-oriented dialogue systems. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 808–819, https://doi.org/10.18653/v1/P19-1078

  • Wu CS, Hoi S, Socher R, Xiong C (2020a) TOD-BERT: Pre-trained natural language understanding for task-oriented dialogues. https://arxiv.org/abs/2004.06871

  • Wu J, Wang X, Wang WY (2019b) Self-supervised dialogue learning. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3857–3867, https://doi.org/10.18653/v1/P19-1375

  • Wu S, Li Y, Zhang D, Zhou Y, Wu Z (2020b) Diverse and informative dialogue generation with context-specific commonsense knowledge awareness. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 5811–5820, https://doi.org/10.18653/v1/2020.acl-main.515

  • Wu W, Guo Z, Zhou X, Wu H, Zhang X, Lian R, Wang H (2019c) Proactive human-machine conversation with explicit conversation goals. https://arxiv.org/abs/1906.05572

  • Wu Y, Wu W, Xing C, Zhou M, Li Z (2017) Sequential matching network: A new architecture for multi-turn response selection in retrieval-based chatbots. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Vancouver, Canada, pp 496–505, https://doi.org/10.18653/v1/P17-1046

  • Wu Z, Galley M, Brockett C, Zhang Y, Gao X, Quirk C, Koncel-Kedziorski R, Gao J, Hajishirzi H, Ostendorf M, et al. (2020c) A controllable model of grounded response generation. https://arxiv.org/abs/2005.00613

  • Xiao H, Huang M, Hao Y, Zhu X (2015) TransG: A generative mixture model for knowledge graph embedding. https://arxiv.org/abs/1509.05488

  • Xiao H, Huang M, Meng L, Zhu X (2017) SSP: semantic space projection for knowledge graph embedding with text descriptions. In: Singh SP, Markovitch S (eds) Proceedings of the thirty-first AAAI conference on artificial intelligence, February 4–9, 2017, San Francisco, California, USA, AAAI Press, pp 3104–3110, http://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14306

  • Xie R, Liu Z, Jia J, Luan H, Sun M (2016) Representation learning of knowledge graphs with entity descriptions. In: Schuurmans D, Wellman MP (eds) Proceedings of the Thirtieth AAAI conference on artificial intelligence, February 12–17, 2016, Phoenix, Arizona, USA, AAAI Press, pp 2659–2665, http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/12216

  • Xing C, Wu W, Wu Y, Liu J, Huang Y, Zhou M, Ma W (2017) Topic aware neural response generation. In: Singh SP, Markovitch S (eds) Proceedings of the Thirty-First AAAI conference on artificial intelligence, February 4–9, 2017, San Francisco, California, USA, AAAI Press, pp 3351–3357, http://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14563

  • Xing C, Wu Y, Wu W, Huang Y, Zhou M (2018) Hierarchical recurrent attention network for response generation. In: McIlraith SA, Weinberger KQ (eds) Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, AAAI Press, pp 5610–5617, https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16510

  • Xu C, Wu W, Tao C, Hu H, Schuerman M, Wang Y (2019) Neural response generation with meta-words. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5416–5426, https://doi.org/10.18653/v1/P19-1538

  • Xu J, Wang H, Niu ZY, Wu H, Che W, Liu T (2020a) Conversational graph grounded policy learning for open-domain conversation generation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 1835–1845, https://doi.org/10.18653/v1/2020.acl-main.166

  • Xu K, Tan H, Song L, Wu H, Zhang H, Song L, Yu D (2020b) Semantic Role Labeling Guided Multi-turn Dialogue ReWriter. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 6632–6639, https://doi.org/10.18653/v1/2020.emnlp-main.537

  • Yadollahi A, Shahraki AG, Zaiane OR (2017) Current state of text sentiment analysis from opinion to emotion mining. ACM Comput Surv (CSUR) 50(2):1–33

  • Yang S, Zhang R, Erfani S (2020) GraphDialog: Integrating graph knowledge into end-to-end task-oriented dialogue systems. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online, pp 1878–1888, https://doi.org/10.18653/v1/2020.emnlp-main.147

  • Yann D, Tur G, Hakkani-Tur D, Heck L (2014) Zero-shot learning and clustering for semantic utterance classification using deep learning. In: International conference on learning representations

  • Yao K, Zweig G, Hwang MY, Shi Y, Yu D (2013) Recurrent neural networks for language understanding. In: Interspeech, pp 2524–2528

  • Yao K, Peng B, Zhang Y, Yu D, Zweig G, Shi Y (2014) Spoken language understanding using long short-term memory neural networks. In: 2014 IEEE spoken language technology workshop (SLT), IEEE, pp 189–194

  • Yao K, Peng B, Zweig G, Wong KF (2016) An attentional neural conversation model with improved specificity. https://arxiv.org/abs/1606.01292

  • Yih Wt, He X, Gao J (2015) Deep learning and continuous representations for natural language processing. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts, Association for Computational Linguistics, Denver, Colorado, pp 6–8, https://doi.org/10.3115/v1/N15-4004

  • Yin J, Jiang X, Lu Z, Shang L, Li H, Li X (2016) Neural generative question answering. In: Kambhampati S (ed) Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9-15 July 2016, IJCAI/AAAI Press, pp 2972–2978, http://www.ijcai.org/Abstract/16/422

  • Yoshino K, Hori C, Perez J, D’Haro LF, Polymenakos L, Gunasekara C, Lasecki WS, Kummerfeld J, Galley M, Brockett C, et al. (2018) The 7th dialog system technology challenge

  • Young S, Gašić M, Keizer S, Mairesse F, Schatzmann J, Thomson B, Yu K (2010) The hidden information state model: a practical framework for POMDP-based spoken dialogue management. Comput Speech Lang 24(2):150–174

  • Young T, Cambria E, Chaturvedi I, Zhou H, Biswas S, Huang M (2018) Augmenting end-to-end dialogue systems with commonsense knowledge. In: McIlraith SA, Weinberger KQ (eds) Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, AAAI Press, pp 4970–4977, https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16573

  • Young T, Pandelea V, Poria S, Cambria E (2020) Dialogue systems with audio context. Neurocomputing 388:102–109

  • Young T, Xing F, Pandelea V, Ni J, Cambria E (2022) Fusing task-oriented and open-domain dialogues in conversational agents. Proceedings of the AAAI Conference on Artificial Intelligence 36:11622–11629

  • Yu F, Tang J, Yin W, Sun Y, Tian H, Wu H, Wang H (2020) ERNIE-ViL: Knowledge enhanced vision-language representations through scene graph. https://arxiv.org/abs/2006.16934

  • Yu T, Joty S (2020) Online conversation disentanglement with pointer networks. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 6321–6330, https://doi.org/10.18653/v1/2020.emnlp-main.512

  • Zaheer M, Guruganesh G, Dubey KA, Ainslie J, Alberti C, Ontañón S, Pham P, Ravula A, Wang Q, Yang L, Ahmed A (2020) Big bird: Transformers for longer sequences. In: Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (eds) Advances in neural information processing systems 33: annual conference on neural information processing systems 2020, NeurIPS 2020, December 6–12, 2020, virtual, https://proceedings.neurips.cc/paper/2020/hash/c8512d142a2d849725f31a9a7a361ab9-Abstract.html

  • Zahiri SM, Choi JD (2017) Emotion detection on TV show transcripts with sequence-based convolutional neural networks. https://arxiv.org/abs/1708.04299

  • Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. In: European conference on computer vision, Springer, pp 818–833

  • Zhang C, Li Y, Du N, Fan W, Yu P (2019a) Joint slot filling and intent detection via capsule neural networks. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5259–5267, https://doi.org/10.18653/v1/P19-1519

  • Zhang C, Li Y, Du N, Fan W, Yu P (2019b) Joint slot filling and intent detection via capsule neural networks. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 5259–5267, https://doi.org/10.18653/v1/P19-1519

  • Zhang H, Lan Y, Pang L, Guo J, Cheng X (2019c) ReCoSa: detecting the relevant contexts with self-attention for multi-turn dialogue generation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3721–3730, https://doi.org/10.18653/v1/P19-1362

  • Zhang H, Liu Z, Xiong C, Liu Z (2020a) Grounded conversation generation as guided traverses in commonsense knowledge graphs. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 2031–2043, https://doi.org/10.18653/v1/2020.acl-main.184

  • Zhang J, Danescu-Niculescu-Mizil C (2020) Balancing objectives in counseling conversations: Advancing forwards or looking backwards. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 5276–5289, https://doi.org/10.18653/v1/2020.acl-main.470

  • Zhang S, Dinan E, Urbanek J, Szlam A, Kiela D, Weston J (2018a) Personalizing dialogue agents: I have a dog, do you have pets too? In: Proceedings of the 56th annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Melbourne, Australia, pp 2204–2213, https://doi.org/10.18653/v1/P18-1205

  • Zhang Y, Wallace B (2017) A sensitivity analysis of (and practitioners’ guide to) convolutional neural networks for sentence classification. In: Proceedings of the eighth international joint conference on natural language processing (volume 1: long papers), Asian Federation of natural language processing, Taipei, Taiwan, pp 253–263, https://aclanthology.org/I17-1026

  • Zhang Y, Galley M, Gao J, Gan Z, Li X, Brockett C, Dolan B (2018b) Generating informative and diverse conversational responses via adversarial information maximization. In: Bengio S, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems 31: annual conference on neural information processing systems 2018, NeurIPS 2018, December 3–8, 2018, Montréal, Canada, pp 1815–1825, https://proceedings.neurips.cc/paper/2018/hash/23ce1851341ec1fa9e0c259de10bf87c-Abstract.html

  • Zhang Y, Ou Z, Hu M, Feng J (2020b) A probabilistic end-to-end task-oriented dialog model with latent belief states towards semi-supervised learning. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 9207–9219, https://doi.org/10.18653/v1/2020.emnlp-main.740

  • Zhang Z, Li J, Zhu P, Zhao H, Liu G (2018c) Modeling multi-turn conversation with deep utterance aggregation. In: Proceedings of the 27th international conference on computational linguistics, association for computational linguistics, Santa Fe, New Mexico, USA, pp 3740–3752, https://aclanthology.org/C18-1317

  • Zhang Z, Li X, Gao J, Chen E (2019d) Budgeted policy learning for task-oriented dialogue systems. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3742–3751, https://doi.org/10.18653/v1/P19-1364

  • Zhao T, Eskenazi M (2016) Towards end-to-end learning for dialog state tracking and management using deep reinforcement learning. In: Proceedings of the 17th annual meeting of the special interest group on discourse and dialogue, association for computational linguistics, Los Angeles, pp 1–10, https://doi.org/10.18653/v1/W16-3601

  • Zhao T, Eskenazi M (2018) Zero-shot dialog generation with cross-domain latent actions. In: Proceedings of the 19th annual sigdial meeting on discourse and dialogue, association for computational linguistics, Melbourne, Australia, pp 1–10, https://doi.org/10.18653/v1/W18-5001

  • Zhao T, Lee K, Eskenazi M (2018) Unsupervised discrete sentence representation learning for interpretable neural dialog generation. In: Proceedings of the 56th annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Melbourne, Australia, pp 1098–1107, https://doi.org/10.18653/v1/P18-1101

  • Zhao T, Lala D, Kawahara T (2020a) Designing precise and robust dialogue response evaluators. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 26–33, https://doi.org/10.18653/v1/2020.acl-main.4

  • Zhao X, Wu W, Xu C, Tao C, Zhao D, Yan R (2020b) Knowledge-grounded dialogue generation with pre-trained language models. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 3377–3390, https://doi.org/10.18653/v1/2020.emnlp-main.272

  • Zhong P, Zhang C, Wang H, Liu Y, Miao C (2020) Towards persona-based empathetic conversational models. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 6556–6566, https://doi.org/10.18653/v1/2020.emnlp-main.531

  • Zhou H, Huang M, Zhu X (2016) Context-aware natural language generation for spoken dialogue systems. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers, The COLING 2016 Organizing Committee, Osaka, Japan, pp 2032–2041, https://aclanthology.org/C16-1191

  • Zhou H, Zheng C, Huang K, Huang M, Zhu X (2020a) KdConv: A Chinese multi-domain dialogue dataset towards multi-turn knowledge-driven conversation. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, online, pp 7098–7108, https://doi.org/10.18653/v1/2020.acl-main.635

  • Zhou K, Prabhumoye S, Black AW (2018) A dataset for document grounded conversations. In: Proceedings of the 2018 conference on empirical methods in natural language processing, association for computational linguistics, Brussels, Belgium, pp 708–713, https://doi.org/10.18653/v1/D18-1076

  • Zhou L, Palangi H, Zhang L, Hu H, Corso JJ, Gao J (2020b) Unified vision-language pre-training for image captioning and VQA. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7–12, 2020, AAAI Press, pp 13041–13049, https://aaai.org/ojs/index.php/AAAI/article/view/7005

  • Zhou X, Wang WY (2018) MojiTalk: Generating emotional responses at scale. In: Proceedings of the 56th annual meeting of the association for computational linguistics (volume 1: long papers), association for computational linguistics, Melbourne, Australia, pp 1128–1137, https://doi.org/10.18653/v1/P18-1104

  • Zhu Q, Cui L, Zhang WN, Wei F, Liu T (2019) Retrieval-enhanced adversarial training for neural response generation. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, Italy, pp 3763–3773, https://doi.org/10.18653/v1/P19-1366

  • Zhu Q, Zhang WN, Liu T, Wang WY (2020) Counterfactual off-policy training for neural dialogue generation. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), association for computational linguistics, online, pp 3438–3448, https://doi.org/10.18653/v1/2020.emnlp-main.276

Acknowledgements

This research/project is supported by A*STAR under its Industry Alignment Fund (LOA Award I1901E0046).

Author information

Corresponding author

Correspondence to Erik Cambria.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The frameworks, topics, and datasets discussed originate from an extensive literature review of state-of-the-art research. We have tried to be comprehensive but may still have omitted some works. Readers are welcome to suggest corrections for any omissions or mistakes in this article. We also intend to update this article over time as new approaches or definitions are proposed and adopted by the community.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ni, J., Young, T., Pandelea, V. et al. Recent advances in deep learning based dialogue systems: a systematic survey. Artif Intell Rev 56, 3055–3155 (2023). https://doi.org/10.1007/s10462-022-10248-8

  • DOI: https://doi.org/10.1007/s10462-022-10248-8
