Here we present AusKidTalk [1], an audio-visual (AV) corpus of Australian children’s speech collected to facilitate the development of speech-based technological solutions for children. It builds upon the technology and expertise developed through the collection of an earlier corpus of Australian adult speech, AusTalk [2,3]. This multi-site initiative was established to remedy the dire shortage, in Australia and around the world, of children’s speech corpora large enough to train accurate automated speech processing tools for children. We are collecting ~600 hours of speech from children aged 3–12 years, including single-word and sentence productions as well as narrative and emotional speech. In this paper, we discuss the key requirements for AusKidTalk and how we designed the recording setup and protocol to meet them. We also discuss key findings from our feasibility study of the recording protocol, recording tools, and user interface.
Cite as: Ahmed, B., Ballard, K.J., Burnham, D., Sirojan, T., Mehmood, H., Estival, D., Baker, E., Cox, F., Arciuli, J., Benders, T., Demuth, K., Kelly, B., Diskin-Holdaway, C., Shahin, M., Sethu, V., Epps, J., Lee, C.B., Ambikairajah, E. (2021) AusKidTalk: An Auditory-Visual Corpus of 3- to 12-Year-Old Australian Children’s Speech. Proc. Interspeech 2021, 3680-3684, doi: 10.21437/Interspeech.2021-2000
@inproceedings{ahmed21_interspeech,
  author={Beena Ahmed and Kirrie J. Ballard and Denis Burnham and Tharmakulasingam Sirojan and Hadi Mehmood and Dominique Estival and Elise Baker and Felicity Cox and Joanne Arciuli and Titia Benders and Katherine Demuth and Barbara Kelly and Chloé Diskin-Holdaway and Mostafa Shahin and Vidhyasaharan Sethu and Julien Epps and Chwee Beng Lee and Eliathamby Ambikairajah},
  title={{AusKidTalk: An Auditory-Visual Corpus of 3- to 12-Year-Old Australian Children’s Speech}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={3680--3684},
  doi={10.21437/Interspeech.2021-2000}
}