DOI: 10.1145/3434074.3447229
short-paper

Identification and Engagement of Passive Subjects in Multiparty Conversations by a Humanoid Robot

Published: 08 March 2021

ABSTRACT

In this work, we present a novel human-robot interaction (HRI) method that uses a humanoid robot to detect and engage passive subjects in multiparty conversations. Voice activity detection and speaker localization are combined with facial recognition to detect and identify non-participating subjects. Once a non-participating individual is identified, the robot addresses the subject with a fact related to the topic of the conversation, with the goal of prompting the subject to join the conversation. To produce sentences related to the topic of the conversation, automatic speech recognition and natural language processing techniques are employed. Preliminary experiments demonstrate that the method successfully identifies and engages passive subjects in a conversation.
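
The detection step described in the abstract can be illustrated with a minimal sketch. It assumes that face recognition has already produced the set of people present and that voice activity detection plus speaker localization has produced speech segments attributed to them. The function name `find_passive_subjects`, the tuple layout of the segments, and the `min_share` threshold are all hypothetical; the abstract does not state how the authors decide that a subject is passive.

```python
from collections import defaultdict


def find_passive_subjects(present, speech_segments, min_share=0.1):
    """Return the people in `present` whose share of speaking time is below `min_share`.

    present: identities of the people in the scene (assumed to come from face recognition).
    speech_segments: (speaker, start_s, end_s) tuples (assumed to come from voice
        activity detection combined with speaker localization).
    min_share: hypothetical threshold, not taken from the paper.
    """
    present = list(present)

    # Accumulate the total speaking time of each speaker, in seconds.
    talk_time = defaultdict(float)
    for speaker, start_s, end_s in speech_segments:
        talk_time[speaker] += max(0.0, end_s - start_s)

    total = sum(talk_time[person] for person in present)
    if total == 0:
        # Nobody has spoken yet, so everyone counts as passive.
        return sorted(present)

    return sorted(p for p in present if talk_time[p] / total < min_share)


segments = [("ana", 0.0, 10.0), ("ben", 10.0, 18.0), ("cy", 18.0, 19.0)]
passive = find_passive_subjects(["ana", "ben", "cy", "dee"], segments)
print(passive)
```

In this sketch a person who is visible but never speaks is flagged in the same way as one who speaks very little. A real system would likely also need a time window and smoothing, which are left out here.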

          Published in

          HRI '21 Companion: Companion of the 2021 ACM/IEEE International Conference on Human-Robot Interaction
          March 2021
          756 pages
          ISBN: 9781450382908
          DOI: 10.1145/3434074
          General Chairs: Cindy Bethel, Ana Paiva
          Program Chairs: Elizabeth Broadbent, David Feil-Seifer, Daniel Szafir

          Copyright © 2021 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 192 of 519 submissions, 37%
