We investigate the number of speakers and the amount of data required to develop usable speaker-independent speech-recognition systems in resource-scarce languages. Our experiments employ the Lwazi corpus, which contains speech in the eleven official languages of South Africa. We find that a surprisingly small number of speakers (fewer than 50) and around 10 to 20 hours of speech per language are sufficient for acceptable phone-based recognition.
Bibliographic reference. Barnard, Etienne / Davel, Marelie / Heerden, Charl van (2009): "ASR corpus design for resource-scarce languages", In INTERSPEECH-2009, 2847-2850.