Abstract
Speech recognition is regarded as one of the key technologies for natural interaction with robots targeting the consumer market. However, speech recognition in human-robot interaction is typically affected by noise in the operational environment, which degrades the recognition of spoken commands. Consequently, finite-state grammars and statistical language models, even when tailored to the target domain, exhibit a high rate of false positives or low accuracy. In this paper, a discriminative re-ranking method is applied to a simple speech and language processing cascade, based on off-the-shelf components, in realistic conditions. Tree kernels are applied to improve recognition accuracy by re-ranking the n-best list returned by the speech recognition component. The rationale behind our approach is to reduce the effort required to devise domain-dependent solutions when designing speech interfaces for language processing in human-robot interaction.
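To make the idea of kernel-based re-ranking concrete, the following is a minimal sketch and not the authors' implementation. It assumes each hypothesis in the n-best list already comes with a constituency parse, written here as nested tuples, and it scores hypotheses with a subset tree kernel in the style of Collins and Duffy against a handful of weighted example trees. The names `tree_kernel`, `rerank`, `support` and `nbest`, the decay value `lam`, the toy parses and the weights are all illustrative assumptions; in a trained system the weights would come from a learner such as an SVM.

```python
from itertools import product
from math import sqrt

def nodes(tree):
    """Yield every internal node of a tree written as (label, child, ...).
    Leaves (words) are plain strings and are not yielded."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)

def production(node):
    """Grammar production at a node: its label plus its children's labels or words."""
    kids = tuple(("N", c[0]) if isinstance(c, tuple) else ("W", c) for c in node[1:])
    return (node[0], kids)

def common(n1, n2, lam):
    """Decayed count of tree fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple):  # equal productions: c2 is a tuple as well
            score *= 1.0 + common(c1, c2, lam)
    return score

def tree_kernel(t1, t2, lam=0.5):
    """Subset tree kernel: sum of shared-fragment counts over all node pairs."""
    return sum(common(a, b, lam) for a, b in product(nodes(t1), nodes(t2)))

def normalized_kernel(t1, t2, lam=0.5):
    """Kernel scaled to [0, 1] so that tree size does not dominate the score."""
    return tree_kernel(t1, t2, lam) / sqrt(tree_kernel(t1, t1, lam) * tree_kernel(t2, t2, lam))

def rerank(nbest, support, lam=0.5):
    """Sort (transcription, tree) pairs by a weighted kernel score, best first.
    `support` is a list of (weight, tree) pairs; the weights are assumed given."""
    def score(hyp):
        return sum(w * normalized_kernel(hyp[1], t, lam) for w, t in support)
    return sorted(nbest, key=score, reverse=True)

# Hypothetical weighted examples: one well-formed command, one typical misrecognition.
support = [
    (+1.0, ("S", ("VP", ("VB", "take"), ("NP", ("DT", "the"), ("NN", "cup"))))),
    (-1.0, ("S", ("NP", ("NN", "cake"), ("DT", "the"), ("NN", "cup")))),
]

# Hypothetical n-best list in recognizer order: the misrecognition is ranked first.
nbest = [
    ("cake the book", ("S", ("NP", ("NN", "cake"), ("DT", "the"), ("NN", "book")))),
    ("take the book", ("S", ("VP", ("VB", "take"), ("NP", ("DT", "the"), ("NN", "book"))))),
]

reranked = rerank(nbest, support)
print([text for text, _ in reranked])
```

In this toy setting the second hypothesis shares most of its syntactic structure with the positively weighted example, so it moves to the top of the list even though the recognizer ranked it lower. The normalisation step is a design choice made here to keep long and short hypotheses comparable.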
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Basili, R., Bastianelli, E., Castellucci, G., Nardi, D., Perera, V. (2013). Kernel-Based Discriminative Re-ranking for Spoken Command Understanding in HRI. In: Baldoni, M., Baroglio, C., Boella, G., Micalizio, R. (eds) AI*IA 2013: Advances in Artificial Intelligence. AI*IA 2013. Lecture Notes in Computer Science, vol 8249. Springer, Cham. https://doi.org/10.1007/978-3-319-03524-6_15
DOI: https://doi.org/10.1007/978-3-319-03524-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03523-9
Online ISBN: 978-3-319-03524-6
eBook Packages: Computer Science, Computer Science (R0)