Two-stage multi-intent detection for spoken language understanding

Multimedia Tools and Applications

Abstract

This paper presents a system that detects multiple intents (MIs) in an input sentence when only single-intent (SI)-labeled training data are available. It categorizes input sentences into three types and uses a two-stage approach in which each stage targets MIs in different types of sentences. In the first stage, the system generates MI hypotheses based on conjunctions in the input sentence, evaluates them, and selects the best hypothesis that satisfies specified conditions. In the second stage, the system applies sequence labeling to mark intents on the input sentence; the sequence labeling model is trained on SI-labeled data. In experiments, the proposed two-stage MI detection method reduced errors by 20.54% for written input and 17.34% for spoken input.
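The first stage described above can be illustrated with a minimal sketch. The abstract does not give the conjunction list, the scoring function, or the acceptance conditions, so everything named here is an assumption for illustration: the `CONJUNCTIONS` set, the `toy_classify` stand-in for an SI classifier, the confidence `threshold`, and the rule that the two segments must carry different intents. It is not the authors' implementation.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical conjunction list; the abstract does not specify the real one.
CONJUNCTIONS = {"and", "then", "also"}

# An SI classifier maps a text segment to (intent label, confidence).
Classifier = Callable[[str], Tuple[str, float]]


def generate_hypotheses(tokens: List[str]) -> List[Tuple[str, str]]:
    """Split the sentence at each conjunction that has words on both sides."""
    hypotheses = []
    for i, tok in enumerate(tokens):
        if tok.lower() in CONJUNCTIONS and 0 < i < len(tokens) - 1:
            left = " ".join(tokens[:i])
            right = " ".join(tokens[i + 1:])
            hypotheses.append((left, right))
    return hypotheses


def detect_first_stage(
    sentence: str, classify: Classifier, threshold: float = 0.5
) -> Optional[List[str]]:
    """Return two intents for the best accepted split, or None.

    Assumed acceptance conditions: both segments are classified with
    confidence >= threshold, and their intents differ.
    """
    best: Optional[List[str]] = None
    best_score = threshold
    for left, right in generate_hypotheses(sentence.split()):
        (l_intent, l_conf), (r_intent, r_conf) = classify(left), classify(right)
        score = min(l_conf, r_conf)  # a split is only as good as its weaker half
        if l_intent != r_intent and score >= best_score:
            best, best_score = [l_intent, r_intent], score
    return best


def toy_classify(text: str) -> Tuple[str, float]:
    """Keyword-based stand-in for a classifier trained on SI-labeled data."""
    if "weather" in text:
        return ("ask_weather", 0.9)
    if "alarm" in text:
        return ("set_alarm", 0.9)
    return ("unknown", 0.1)


result = detect_first_stage("show the weather and set an alarm", toy_classify)
```

When no split is accepted, the function returns `None`; in the two-stage design that sentence would then be passed to the second-stage sequence labeler, which is not sketched here.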

Notes

  1. http://itl.nist.gov/iad/mig/tests/rt/2004-fall/

Acknowledgments

This work was supported by the ATC (Advanced Technology Center) Program, “Development of Conversational Q&A Search Framework Based On Linked Data” (Project No. 10048448).

Author information

Correspondence to Byeongchang Kim.

About this article

Cite this article

Kim, B., Ryu, S. & Lee, G.G. Two-stage multi-intent detection for spoken language understanding. Multimed Tools Appl 76, 11377–11390 (2017). https://doi.org/10.1007/s11042-016-3724-4

  • DOI: https://doi.org/10.1007/s11042-016-3724-4
