research-article

Open Access

Clarifying commands with information-theoretic human-robot dialog

Authors:
Robin Deits, Battelle Memorial Institute
Stefanie Tellex, MIT Computer Science and Artificial Intelligence Laboratory
Pratiksha Thaker, MIT Computer Science and Artificial Intelligence Laboratory
Dimitar Simeonov, MIT Computer Science and Artificial Intelligence Laboratory
Thomas Kollar, Carnegie Mellon University
Nicholas Roy, MIT Computer Science and Artificial Intelligence Laboratory

Journal of Human-Robot Interaction, Volume 2, Issue 2, June 2013, pp. 58–79. https://doi.org/10.5898/JHRI.2.2.Deits

Published: 18 June 2013

Abstract

Our goal is to improve the efficiency and effectiveness of natural language communication between humans and robots. Human language is frequently ambiguous, and a robot's limited sensing makes complete understanding of a statement even more difficult. To address these challenges, we describe an approach for enabling a robot to engage in clarifying dialog with a human partner, just as a human might do in a similar situation. Given an unconstrained command from a human operator, the robot asks one or more questions and receives natural language answers from the human. We apply an information-theoretic approach to choosing questions for the robot to ask. Specifically, we choose the type and subject of questions in order to maximize the reduction in Shannon entropy of the robot's mapping between language and entities in the world. Within the framework of the G³ graphical model, we derive a method to estimate this entropy reduction, choose the optimal question to ask, and merge the information gained from the human operator's answer. We demonstrate that this improves the accuracy of command understanding over prior work while asking fewer questions as compared to baseline question-selection strategies.
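
The question-selection idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's G³-based estimator; it assumes a small discrete distribution over candidate groundings for one phrase and yes/no questions with known answer likelihoods. All names (`entropy`, `posterior`, `expected_entropy_reduction`, `choose_question`, the pallet labels) are hypothetical and chosen only for illustration.

```python
import math


def entropy(dist):
    """Shannon entropy (in bits) of a dict mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def posterior(prior, likelihood, answer):
    """Bayes update of the grounding distribution given an observed answer."""
    unnorm = {g: p * likelihood(answer, g) for g, p in prior.items()}
    z = sum(unnorm.values())
    if z == 0:
        # The answer is impossible under the model; keep the prior unchanged.
        return dict(prior)
    return {g: v / z for g, v in unnorm.items()}


def expected_entropy_reduction(prior, likelihood, answers):
    """Prior entropy minus the expected posterior entropy over all answers."""
    expected_posterior_entropy = 0.0
    for a in answers:
        # Marginal probability of hearing answer `a`.
        p_a = sum(p * likelihood(a, g) for g, p in prior.items())
        if p_a > 0:
            expected_posterior_entropy += p_a * entropy(posterior(prior, likelihood, a))
    return entropy(prior) - expected_posterior_entropy


def choose_question(prior, questions):
    """Pick the question with the largest expected entropy reduction."""
    return max(
        questions,
        key=lambda q: expected_entropy_reduction(prior, q["likelihood"], q["answers"]),
    )


def yes_no_about(target):
    """Assumed noise-free answer model for the question 'Do you mean <target>?'."""
    def likelihood(answer, grounding):
        truthful = "yes" if grounding == target else "no"
        return 1.0 if answer == truthful else 0.0
    return likelihood


# Hypothetical belief over what "the pallet" refers to.
prior = {"pallet_a": 0.5, "pallet_b": 0.25, "pallet_c": 0.25}
questions = [
    {"name": "ask_about_a", "answers": ["yes", "no"], "likelihood": yes_no_about("pallet_a")},
    {"name": "ask_about_b", "answers": ["yes", "no"], "likelihood": yes_no_about("pallet_b")},
]
best = choose_question(prior, questions)
gain_a = expected_entropy_reduction(prior, questions[0]["likelihood"], ["yes", "no"])
gain_b = expected_entropy_reduction(prior, questions[1]["likelihood"], ["yes", "no"])
```

In this toy setting, asking about the most probable candidate splits the belief evenly and so yields the larger expected reduction. Merging the answer then amounts to replacing the prior with `posterior(prior, likelihood, answer)`.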

Index Terms

Clarifying commands with information-theoretic human-robot dialog
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Index terms have been assigned to the content through auto-classification.

Published in

Journal of Human-Robot Interaction Volume 2, Issue 2
Special Issue on Technical and Social Advances in HRI: An Invitational Issue of JHRI
June 2013
129 pages
EISSN:2163-0364
Publisher: Journal of Human-Robot Interaction Steering Committee
Author Tags
dialog
human-robot interaction
information theory
natural language