short-paper

Identification and Engagement of Passive Subjects in Multiparty Conversations by a Humanoid Robot

Authors:
David Ayllon
UBTECH North America Research and Development Center, Pasadena, CA, USA

Ting-Shuo Chou
UBTECH North America Research and Development Center, Pasadena, CA, USA

Adam King
UBTECH North America Research and Development Center, Pasadena, CA, USA

Yang Shen
UBTECH North America Research and Development Center, Pasadena, CA, USA

HRI '21 Companion: Companion of the 2021 ACM/IEEE International Conference on Human-Robot Interaction, March 2021, Pages 535–539. https://doi.org/10.1145/3434074.3447229

Published: 8 March 2021

ABSTRACT

In this work, we present a novel human-robot interaction (HRI) method that uses a humanoid robot to detect and engage passive subjects in multiparty conversations. Voice activity detection and speaker localization are combined with facial recognition to detect and identify non-participating subjects. Once a non-participating individual is identified, the robot addresses the subject with a fact related to the topic of the conversation, with the goal of prompting the subject to join the conversation. Automatic speech recognition and natural language processing techniques are employed to produce sentences related to the topic of the conversation. Preliminary experiments demonstrate that the method successfully identifies and engages passive subjects in a conversation.
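
The identification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that voice activity detection, speaker localization, and facial recognition have already produced speech segments labeled with a participant identity, and it flags as passive anyone whose share of the total speaking time falls below a threshold. The `SpeechSegment` type, the `find_passive_subjects` function, the 10% threshold, and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechSegment:
    """One voiced segment already attributed to a recognized participant.

    Hypothetical format: the abstract does not specify how segments are stored.
    """
    speaker_id: str
    start_s: float
    end_s: float


def find_passive_subjects(participants, segments, min_share=0.10):
    """Return participants whose share of total speaking time is below min_share.

    min_share is an assumed threshold, not a value taken from the paper.
    """
    talk_time = defaultdict(float)
    for seg in segments:
        # Ignore malformed segments with negative duration.
        talk_time[seg.speaker_id] += max(0.0, seg.end_s - seg.start_s)

    total = sum(talk_time[p] for p in participants)
    if total == 0.0:
        return []  # nobody has spoken yet, so no one is singled out
    return [p for p in participants if talk_time[p] / total < min_share]


# Made-up example data for three recognized participants.
participants = ["alice", "bob", "carol"]
segments = [
    SpeechSegment("alice", 0.0, 12.0),
    SpeechSegment("bob", 12.5, 20.5),
    SpeechSegment("alice", 21.0, 30.0),
    SpeechSegment("carol", 30.5, 31.5),
]
passive = find_passive_subjects(participants, segments)
print(passive)
```

In this made-up example only "carol" falls below the threshold, so the robot would address her. A relative share is used rather than an absolute duration so that the same threshold applies to short and long conversations; a real system might also weight recent speech more heavily.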

Index Terms

Identification and Engagement of Passive Subjects in Multiparty Conversations by a Humanoid Robot
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. HCI design and evaluation methods
    2. Interaction devices

Published in
HRI '21 Companion: Companion of the 2021 ACM/IEEE International Conference on Human-Robot Interaction
March 2021
756 pages
ISBN:9781450382908
DOI:10.1145/3434074
General Chairs:
Cindy Bethel, Mississippi State University, USA
Ana Paiva, INESC-ID, IST, University of Lisbon, Portugal & Radcliffe Institute for Advanced Study, Harvard University, USA
Program Chairs:
Elizabeth Broadbent, University of Auckland, New Zealand
David Feil-Seifer, University of Nevada Reno, USA
Daniel Szafir, University of Colorado Boulder, USA
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
conversational engagement
conversational participation
natural language processing
spoken dialogue systems
Qualifiers
- short-paper
Acceptance Rates
Overall Acceptance Rate: 192 of 519 submissions, 37%
Article Metrics
- Total Citations: 3
- Total Downloads: 148
- Downloads (Last 12 months): 25
- Downloads (Last 6 weeks): 3