Abstract
Nonverbal communication compensates for meanings that spoken language omits, and in some situations it can even take the place of verbal communication. In this paper, we describe how nonverbal acoustic communication can be exploited in human-computer interaction. We first survey acoustic techniques for nonverbal communication, then present a design framework for intelligent agents based on nonverbal communication, and finally demonstrate the design with an experimental iPad game.
Lin, Y., Makedon, F. Nonverbal acoustic communication in human-computer interaction. Artif Intell Rev 35, 319–338 (2011). https://doi.org/10.1007/s10462-010-9196-4