DOI: 10.1145/3411764.3445490
research-article

PTeacher: a Computer-Aided Personalized Pronunciation Training System with Exaggerated Audio-Visual Corrective Feedback

Published: 07 May 2021

ABSTRACT

Second language (L2) English learners often find it difficult to improve their pronunciation because they lack expressive and personalized corrective feedback. In this paper, we present Pronunciation Teacher (PTeacher), a Computer-Aided Pronunciation Training (CAPT) system that provides personalized, exaggerated audio-visual corrective feedback for mispronunciations. Although the effectiveness of exaggerated feedback has been demonstrated, it remains unclear how to define the appropriate degree of exaggeration when interacting with individual learners. To fill this gap, we interview 100 L2 English learners and 22 professional native-speaking teachers to understand their needs and experiences. Three critical metrics are proposed for both learners and teachers to identify the best exaggeration levels in both the audio and visual modalities. Additionally, we incorporate a personalized dynamic feedback mechanism based on the English proficiency of learners. Based on the obtained insights, a comprehensive interactive pronunciation training course is designed to help L2 learners rectify mispronunciations in a more perceptible, understandable, and discriminative manner. Extensive user studies demonstrate that our system significantly improves learners’ learning efficiency.

Supplemental Material

3411764.3445490_videofigure.mp4 (mp4, 42.8 MB)


        • Published in

          CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
          May 2021
          10862 pages
          ISBN:9781450380966
          DOI:10.1145/3411764

          Copyright © 2021 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Qualifiers

          • research-article
          • Research
          • Refereed limited

          Acceptance Rates

          Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
