HoloSound: Combining Speech and Sound Identification for Deaf or Hard of Hearing Users on a Head-mounted Display

ABSTRACT
Head-mounted displays can provide private, glanceable speech and sound feedback to deaf and hard of hearing people, yet prior systems have largely focused on speech transcription. We introduce HoloSound, a HoloLens-based augmented reality (AR) prototype that uses deep learning to classify and visualize sound identity and location in addition to providing speech transcription. This poster paper presents a working proof-of-concept prototype and discusses future opportunities for advancing AR-based sound awareness.