ISCA Archive Interspeech 2005

TBALL data collection: the making of a young children's speech corpus

Abe Kazemzadeh, Hong You, Markus Iseli, Barbara Jones, Xiaodong Cui, Margaret Heritage, Patti Price, Elaine Anderson, Shrikanth Narayanan, Abeer Alwan

In this paper we describe the data collection for the TBALL project (Technology Based Assessment of Language and Literacy) and report the results of our efforts. We focus on aspects of our corpus that distinguish it from currently available corpora. The speakers are children in grades K-4 who are learning to read; they are largely non-native speakers of English and come from diverse socio-economic backgrounds. We describe how we adapted our recording setup, data collection methodology, and transcription scheme to accommodate these differences. Finally, we discuss the task this corpus was designed to serve and our research approach.


doi: 10.21437/Interspeech.2005-462

Cite as: Kazemzadeh, A., You, H., Iseli, M., Jones, B., Cui, X., Heritage, M., Price, P., Anderson, E., Narayanan, S., Alwan, A. (2005) TBALL data collection: the making of a young children's speech corpus. Proc. Interspeech 2005, 1581-1584, doi: 10.21437/Interspeech.2005-462

@inproceedings{kazemzadeh05_interspeech,
  author={Abe Kazemzadeh and Hong You and Markus Iseli and Barbara Jones and Xiaodong Cui and Margaret Heritage and Patti Price and Elaine Anderson and Shrikanth Narayanan and Abeer Alwan},
  title={{TBALL data collection: the making of a young children's speech corpus}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={1581--1584},
  doi={10.21437/Interspeech.2005-462}
}