Training a Robot via Human Feedback: A Case Study

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8239)

Abstract

We present a case study of applying TAMER, a framework for learning from numeric human feedback, to a physically embodied robot. In doing so, we also provide the first demonstration that multiple behaviors can be trained by such feedback without algorithmic modification, and the first demonstration of a robot learning from free-form, human-generated feedback without any further guidance or evaluative feedback. We describe transparency challenges specific to a physically embodied robot learning from human feedback, and adjustments that address these challenges.
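The core idea behind TAMER, as summarized above, is that the agent treats the human's numeric feedback as a supervised label for a learned model of human reward, and then acts greedily on that model rather than maximizing a discounted return. A minimal sketch of that loop is below; the tabular representation, learning rate, and method names are illustrative assumptions, not the implementation used on the robot in this paper.

```python
class TamerAgent:
    """Minimal TAMER-style sketch: learn a model H_hat(s, a) of human
    reward by incremental supervised updates, and act greedily on it.
    (Illustrative assumptions: a tabular model and a fixed learning rate.)"""

    def __init__(self, actions, lr=0.1):
        self.actions = actions
        self.lr = lr
        self.h_hat = {}  # (state, action) -> predicted human reward

    def choose_action(self, state):
        # Act greedily on predicted human reward; TAMER does not discount
        # future feedback, since the human signal is treated as a label
        # for the most recent behavior, not as an MDP reward.
        return max(self.actions,
                   key=lambda a: self.h_hat.get((state, a), 0.0))

    def give_feedback(self, state, action, reward):
        # Incremental supervised update toward the human's scalar signal.
        key = (state, action)
        pred = self.h_hat.get(key, 0.0)
        self.h_hat[key] = pred + self.lr * (reward - pred)
```

In practice the model is a function approximator over state-action features rather than a table, and feedback must be credited to the recent actions it was likely aimed at, but the greedy-on-learned-reward structure is the same.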



Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Knox, W.B., Stone, P., Breazeal, C. (2013). Training a Robot via Human Feedback: A Case Study. In: Herrmann, G., Pearson, M.J., Lenz, A., Bremner, P., Spiers, A., Leonards, U. (eds) Social Robotics. ICSR 2013. Lecture Notes in Computer Science(), vol 8239. Springer, Cham. https://doi.org/10.1007/978-3-319-02675-6_46

  • DOI: https://doi.org/10.1007/978-3-319-02675-6_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02674-9

  • Online ISBN: 978-3-319-02675-6

  • eBook Packages: Computer Science, Computer Science (R0)
