ISCA Archive SLTU 2012

Resource development and experiments in automatic South African broadcast news transcription

Herman Kamper, Febe de Wet, Thomas Hain, Thomas Niesler

We describe the development and evaluation of a first South African broadcast news transcription system. We describe a number of speech resources collected in the resource-scarce South African environment for system development: a 20-hour corpus of South African English (SAE) broadcast news; a 109M-word corpus of South African newspaper text collected for language modelling; and a 60k-word SAE pronunciation dictionary. Our system is modelled on similar state-of-the-art broadcast news transcription systems and uses cross-word triphone HMMs, MF-PLP features, and per-segment cepstral mean and per-bulletin cepstral variance normalisation. Our final system achieves a word error rate of 24.6%. We find that performance is reasonable on newsreader speech but poor on spontaneous and telephone speech in our test data. Finally, we consider the recognition of MP3-compressed audio and show that performance deteriorates only at low bit-rates.

Index Terms: Broadcast news transcription, South African English, under-resourced languages, English accents


Cite as: Kamper, H., de Wet, F., Hain, T., Niesler, T. (2012) Resource development and experiments in automatic South African broadcast news transcription. Proc. 3rd Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 2012), 102-106

@inproceedings{kamper12_sltu,
  author={Herman Kamper and Febe de Wet and Thomas Hain and Thomas Niesler},
  title={{Resource development and experiments in automatic South African broadcast news transcription}},
  year=2012,
  booktitle={Proc. 3rd Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 2012)},
  pages={102--106}
}