AusKidTalk: An Auditory-Visual Corpus of 3- to 12-Year-Old Australian Children’s Speech

Ahmed, Beena; Ballard, Kirrie J.; Burnham, Denis; Sirojan, Tharmakulasingam; Mehmood, Hadi; Estival, Dominique; Baker, Elise; Cox, Felicity; Arciuli, Joanne; Benders, Titia; Demuth, Katherine; Kelly, Barbara; Diskin-Holdaway, Chloé; Shahin, Mostafa; Sethu, Vidhyasaharan; Epps, Julien; Lee, Chwee Beng; Ambikairajah, Eliathamby

doi:10.21437/Interspeech.2021-2000

We present AusKidTalk [1], an audio-visual (AV) corpus of Australian children’s speech collected to facilitate the development of speech-based technological solutions for children. It builds upon the technology and expertise developed through the collection of an earlier corpus of Australian adult speech, AusTalk [2,3]. This multi-site initiative was established to remedy the dire shortage, in Australia and around the world, of children’s speech corpora large enough to train accurate automated speech processing tools for children. We are collecting ~600 hours of speech from children aged 3–12 years, including single-word and sentence productions as well as narrative and emotional speech. In this paper, we discuss the key requirements for AusKidTalk and how we designed the recording setup and protocol to meet them. We also discuss key findings from our feasibility study of the recording protocol, recording tools, and user interface.

Cite as: Ahmed, B., Ballard, K.J., Burnham, D., Sirojan, T., Mehmood, H., Estival, D., Baker, E., Cox, F., Arciuli, J., Benders, T., Demuth, K., Kelly, B., Diskin-Holdaway, C., Shahin, M., Sethu, V., Epps, J., Lee, C.B., Ambikairajah, E. (2021) AusKidTalk: An Auditory-Visual Corpus of 3- to 12-Year-Old Australian Children’s Speech. Proc. Interspeech 2021, 3680-3684, doi: 10.21437/Interspeech.2021-2000

@inproceedings{ahmed21_interspeech,
  author={Beena Ahmed and Kirrie J. Ballard and Denis Burnham and Tharmakulasingam Sirojan and Hadi Mehmood and Dominique Estival and Elise Baker and Felicity Cox and Joanne Arciuli and Titia Benders and Katherine Demuth and Barbara Kelly and Chloé Diskin-Holdaway and Mostafa Shahin and Vidhyasaharan Sethu and Julien Epps and Chwee Beng Lee and Eliathamby Ambikairajah},
  title={{AusKidTalk: An Auditory-Visual Corpus of 3- to 12-Year-Old Australian Children’s Speech}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={3680--3684},
  doi={10.21437/Interspeech.2021-2000}
}