
Nonverbal acoustic communication in human-computer interaction

Published in Artificial Intelligence Review

Abstract

Nonverbal communication conveys many of the implicit meanings omitted from spoken language; in some situations, it may even take the place of verbal communication. In this paper, we show how nonverbal acoustic communication can be utilized in human-computer interaction. We first survey acoustic techniques for nonverbal communication. We then provide a design framework for intelligent agents based on nonverbal communication, and demonstrate the design with an experimental iPad game.



Author information


Correspondence to Yong Lin.

About this article

Cite this article

Lin, Y., Makedon, F. Nonverbal acoustic communication in human-computer interaction. Artif Intell Rev 35, 319–338 (2011). https://doi.org/10.1007/s10462-010-9196-4

