Abstract
This paper presents a system that detects multiple intents (MIs) in an input sentence when only single-intent (SI)-labeled training data are available. To solve this problem, the paper categorizes input sentences into three types and uses a two-stage approach in which each stage attempts to detect MIs in different types of sentences. In the first stage, the system generates MI hypotheses based on conjunctions in the input sentence, evaluates the hypotheses, and selects the best one that satisfies specified conditions. In the second stage, the system applies sequence labeling to mark intents on the input sentence. The sequence labeling model is trained on SI-labeled training data. In experiments, the proposed two-stage MI detection method reduced errors for written and spoken input by 20.54% and 17.34%, respectively.
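The first stage summarized above (generate hypotheses at conjunctions, evaluate them, select the best one that satisfies a condition) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the conjunction list, the keyword-based `toy_si_classifier` standing in for a classifier trained on SI-labeled data, the minimum-confidence scoring, and the `threshold` acceptance condition. The second stage (sequence labeling) is not shown.

```python
from typing import Callable, List, Tuple

# Hypothetical conjunction list; the abstract does not say which words are used.
CONJUNCTIONS = {"and", "then", "also"}

Segment = List[str]
Classifier = Callable[[Segment], Tuple[str, float]]


def generate_hypotheses(tokens: Segment) -> List[List[Segment]]:
    """Return the unsplit sentence plus one two-way split per inner conjunction."""
    hypotheses = [[tokens]]
    for i, tok in enumerate(tokens):
        if tok.lower() in CONJUNCTIONS and 0 < i < len(tokens) - 1:
            hypotheses.append([tokens[:i], tokens[i + 1:]])
    return hypotheses


def toy_si_classifier(segment: Segment) -> Tuple[str, float]:
    """Stand-in for an SI-trained classifier: keyword match, made-up confidence."""
    keywords = {"weather": "ask_weather", "alarm": "set_alarm"}
    hits = [intent for word, intent in keywords.items() if word in segment]
    if len(hits) == 1:
        return hits[0], 0.9  # exactly one intent cue: confident
    return (hits[0] if hits else "unknown"), 0.4  # ambiguous or no cue


def detect_intents(
    tokens: Segment,
    classify: Classifier = toy_si_classifier,
    threshold: float = 0.8,  # assumed acceptance condition for a split
) -> List[str]:
    """Score each hypothesis by its weakest segment and keep the best one.

    A split hypothesis is accepted only if every segment clears `threshold`;
    the unsplit sentence is always an admissible fallback.
    """
    best_intents: List[str] = []
    best_score = -1.0
    for hypothesis in generate_hypotheses(tokens):
        results = [classify(segment) for segment in hypothesis]
        score = min(confidence for _, confidence in results)
        if len(hypothesis) > 1 and score < threshold:
            continue  # split rejected: some segment is not a confident SI
        if score > best_score:
            best_intents = [intent for intent, _ in results]
            best_score = score
    return best_intents


if __name__ == "__main__":
    print(detect_intents("show me the weather and set an alarm".split()))
```

Scoring a hypothesis by its weakest segment is one simple way to reject splits at conjunctions that do not separate two intents (for example, a conjunction inside a name); the paper's actual evaluation and selection conditions may differ.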
Acknowledgments
This work was supported by the ATC (Advanced Technology Center) Program, “Development of Conversational Q&A Search Framework Based on Linked Data” (Project No. 10048448).
Cite this article
Kim, B., Ryu, S. & Lee, G.G. Two-stage multi-intent detection for spoken language understanding. Multimed Tools Appl 76, 11377–11390 (2017). https://doi.org/10.1007/s11042-016-3724-4
DOI: https://doi.org/10.1007/s11042-016-3724-4