Kernel-Based Discriminative Re-ranking for Spoken Command Understanding in HRI

Basili, Roberto; Bastianelli, Emanuele; Castellucci, Giuseppe; Nardi, Daniele; Perera, Vittorio

doi:10.1007/978-3-319-03524-6_15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8249)

Included in the following conference series:

Congress of the Italian Association for Artificial Intelligence

Abstract

Speech recognition is regarded as one of the key technologies for natural interaction with robots targeting the consumer market. However, speech recognition in human-robot interaction is typically affected by noisy conditions in the operational environment, which degrade the recognition of spoken commands. Consequently, finite-state grammars and statistical language models, even though they can be tailored to the target domain, exhibit a high rate of false positives or low accuracy. In this paper, a discriminative re-ranking method is applied to a simple speech and language processing cascade, based on off-the-shelf components, in realistic conditions. Tree kernels are applied to improve the accuracy of the recognition process by re-ranking the n-best list returned by the speech recognition component. The rationale behind our approach is to reduce the effort of devising domain-dependent solutions when designing speech interfaces for language processing in human-robot interaction.
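
To make the idea of kernel-based re-ranking concrete, the following is a minimal sketch and not the authors' implementation. It assumes each n-best hypothesis already comes with a constituency parse, represented here as hand-built nested tuples, and it uses a subset-tree kernel in the style of Collins and Duffy with a decay factor `lam`. The function names, the example trees and the hand-set weights in `model` are all hypothetical; in a real system the weights would be learned by a discriminative learner such as an SVM.

```python
from functools import lru_cache
from math import sqrt

# A tree is a nested tuple (label, child, ...); a leaf word is a plain string.

def production(node):
    """Grammar production at a node; leaf words are tagged so they never
    collide with non-terminal labels."""
    return (node[0],) + tuple(
        ("#", c) if isinstance(c, str) else c[0] for c in node[1:]
    )

def nodes(tree):
    """Yield every internal (non-leaf) node of the tree."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1:]:
        yield from nodes(child)

@lru_cache(maxsize=None)
def delta(n1, n2, lam):
    """Decayed count of common tree fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    result = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):  # productions match, so c2 is internal too
            result *= 1.0 + delta(c1, c2, lam)
    return result

def tree_kernel(t1, t2, lam=0.4):
    """Subset-tree kernel: sum of delta over all pairs of internal nodes."""
    return sum(delta(a, b, lam) for a in nodes(t1) for b in nodes(t2))

def normalized_kernel(t1, t2, lam=0.4):
    """Kernel scaled to [0, 1] so that tree size does not dominate."""
    return tree_kernel(t1, t2, lam) / sqrt(
        tree_kernel(t1, t1, lam) * tree_kernel(t2, t2, lam)
    )

def rerank(nbest, model, lam=0.4):
    """Sort (transcription, tree) hypotheses by a dual-form score
    sum_i w_i * K(support_tree_i, tree), highest score first."""
    def score(hyp):
        return sum(w * normalized_kernel(sv, hyp[1], lam) for w, sv in model)
    return sorted(nbest, key=score, reverse=True)

# Hypothetical example: the recognizer ranks a mis-hearing first.
good = ("S", ("VP", ("V", "take"), ("NP", ("DT", "the"), ("NN", "book"))))
bad = ("S", ("NP", ("NN", "cake")), ("NP", ("DT", "the"), ("NN", "book")))
positive = ("S", ("VP", ("V", "bring"), ("NP", ("DT", "the"), ("NN", "cup"))))

model = [(+1.0, positive), (-1.0, bad)]  # hand-set weights, assumed for illustration
nbest = [("cake the book", bad), ("take the book", good)]
reranked = rerank(nbest, model)
print([text for text, _ in reranked])
```

In this toy setting the well-formed command shares its verb-phrase structure with the positive example, so it moves ahead of the mis-recognized hypothesis even though the recognizer originally ranked it second.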

Author information

Authors and Affiliations

Dept. of Enterprise Engineering, University of Roma Tor Vergata, Rome, Italy
Roberto Basili
Dept. of Civil Engineering and Computer Science Engineering, University of Roma Tor Vergata, Rome, Italy
Emanuele Bastianelli
Dept. of Electronic Engineering, University of Roma Tor Vergata, Rome, Italy
Giuseppe Castellucci
Dept. of Computer, Control, and Management Engineering, University of Roma La Sapienza, Rome, Italy
Daniele Nardi & Vittorio Perera

Editor information

Editors and Affiliations

Dipartimento di Informatica, Università degli Studi di Torino, via Pessinetto 12, 10149, Torino, Italy
Matteo Baldoni, Cristina Baroglio, Guido Boella & Roberto Micalizio

Cite this paper

Basili, R., Bastianelli, E., Castellucci, G., Nardi, D., Perera, V. (2013). Kernel-Based Discriminative Re-ranking for Spoken Command Understanding in HRI. In: Baldoni, M., Baroglio, C., Boella, G., Micalizio, R. (eds) AI*IA 2013: Advances in Artificial Intelligence. AI*IA 2013. Lecture Notes in Computer Science, vol 8249. Springer, Cham. https://doi.org/10.1007/978-3-319-03524-6_15

DOI: https://doi.org/10.1007/978-3-319-03524-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03523-9
Online ISBN: 978-3-319-03524-6
eBook Packages: Computer Science; Computer Science (R0)
