ISCA Archive SLTU 2014

Development of a Korean speech recognition system with little annotated data

Antoine Laurent, Lori Lamel

This paper investigates the development of a speech-to-text transcription system for the Korean language in the context of the DGA RAPID Rapmat project. Korean is an alphasyllabary language spoken by about 78 million people worldwide. As only a small amount of manually transcribed audio data was available, the acoustic models were trained in an unsupervised manner on audio data downloaded from several Korean websites, and the language models were trained on web texts. The reported word and character error rates are estimates, as the development corpus used in these experiments was also constructed from the untranscribed audio data, the web texts and automatic transcriptions. Several variants of unsupervised acoustic model training were compared to assess the influence of the vocabulary size (200k vs 2M), the type of language model (words vs characters), the acoustic unit (phonemes vs half-syllables), as well as incremental batch vs iterative decoding of the untranscribed audio corpus.
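
The abstract reports both word and character error rates. As an illustration only, the following minimal sketch shows how the two metrics are commonly computed from a reference and a hypothesis transcript using edit distance. It is not the authors' scoring tool: the function names are hypothetical, words are assumed to be whitespace-delimited, and a "character" is assumed to be one precomposed Hangul syllable block with spaces ignored.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (substitutions, insertions and deletions all cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(ref, hyp):
    """WER: word-level edit distance divided by the reference word count.
    Assumption: words are separated by whitespace."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def char_error_rate(ref, hyp):
    """CER: character-level edit distance divided by the reference length.
    Assumption: whitespace is ignored and each code point counts as one character."""
    ref_chars = [c for c in ref if not c.isspace()]
    hyp_chars = [c for c in hyp if not c.isspace()]
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# Hypothetical example: the hypothesis differs from the reference only in spacing.
reference = "나는 학교에 간다"
hypothesis = "나는 학교 에 간다"
print(word_error_rate(reference, hypothesis))  # word-level score penalizes the spacing change
print(char_error_rate(reference, hypothesis))  # character-level score ignores it
```

In this made-up example the two transcripts contain the same syllables, so the character-level score is unaffected by the spacing difference while the word-level score is not. This is one general reason both metrics may be reported side by side.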

Index Terms: Speech recognition system, unsupervised acoustic training, Korean, approximative transcripts


Cite as: Laurent, A., Lamel, L. (2014) Development of a Korean speech recognition system with little annotated data. Proc. 4th Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 2014), 146-152

@inproceedings{laurent14_sltu,
  author={Antoine Laurent and Lori Lamel},
  title={{Development of a Korean speech recognition system with little annotated data}},
  year=2014,
  booktitle={Proc. 4th Workshop on Spoken Language Technologies for Under-Resourced Languages  (SLTU 2014)},
  pages={146--152}
}