ISCA Archive Interspeech 2009

Basic speech recognition for spoken dialogues

Charl van Heerden, Etienne Barnard, Marelie Davel

Spoken dialogue systems (SDSs) have great potential for information access in the developing world. However, the realisation of that potential requires the solution of several challenging problems, including the development of sufficiently accurate speech recognisers for a diverse multitude of languages. We investigate the feasibility of developing small-vocabulary speaker-independent ASR systems designed for use in a telephone-based information system, using ten resource-scarce languages spoken in South Africa as a case study.

We contrast a cross-language transfer approach (using a well-trained system from a different language) with the development of new language-specific corpora and systems, and evaluate the effectiveness of both approaches. We find that limited speech corpora (3 to 8 hours of data from around 200 speakers) are sufficient for the development of reasonably accurate recognisers: error rates are in the range 2% to 12% for a ten-word task, where vocabulary words are excluded from training to simulate vocabulary-independent performance. This approach is substantially more accurate than cross-language transfer, and sufficient for the development of basic spoken dialogue systems.


doi: 10.21437/Interspeech.2009-760

Cite as: Heerden, C.v., Barnard, E., Davel, M. (2009) Basic speech recognition for spoken dialogues. Proc. Interspeech 2009, 3003-3006, doi: 10.21437/Interspeech.2009-760

@inproceedings{heerden09b_interspeech,
  author={Charl van Heerden and Etienne Barnard and Marelie Davel},
  title={{Basic speech recognition for spoken dialogues}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={3003--3006},
  doi={10.21437/Interspeech.2009-760}
}