7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Automatic Generation of Phonetic Transcriptions for Large Speech Corpora

Kris Demuynck (1), Tom Laureys (1), Steven Gillis (2)

(1) Katholieke Universiteit Leuven, Belgium; (2) University of Antwerp, Belgium

We describe a method for automatically producing phonetic transcriptions for large speech corpora. First, we discuss several techniques for generating pronunciation variants. Then, we explain how a speech recognition system is used to select the phonetic transcription that best matches the acoustic signal. The system is evaluated on different test sets selected from the Spoken Dutch Corpus, ranging from read-aloud text to spontaneous speech, and achieves promising first results.
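
The two-step idea in the abstract (generate pronunciation variants, then keep the best-matching one) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The rewrite rule, the function names `generate_variants` and `select_best`, and the scoring function are all assumptions made for illustration. In particular, the score here is a toy stand-in (negative edit distance to a given phone sequence), whereas the paper uses a speech recognition system to compare each variant against the acoustics.

```python
# Hypothetical optional rewrite rules over phone sequences: (pattern, replacement).
# Illustrative only; these are not the rules used in the paper.
RULES = [
    (("@", "n"), ("@",)),  # optional deletion of /n/ after schwa
]

def generate_variants(phones, rules):
    """Return every sequence reachable by optionally applying the rules.

    Assumes the rules cannot grow a sequence without bound, so the search ends.
    """
    start = tuple(phones)
    variants = {start}
    frontier = [start]
    while frontier:
        seq = frontier.pop()
        for pattern, replacement in rules:
            n = len(pattern)
            for i in range(len(seq) - n + 1):
                if seq[i:i + n] == pattern:
                    new = seq[:i] + replacement + seq[i + n:]
                    if new not in variants:
                        variants.add(new)
                        frontier.append(new)
    return variants

def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def select_best(variants, score):
    """Pick the highest-scoring variant; sorting first makes ties deterministic."""
    return max(sorted(variants), key=score)

# Canonical transcription of a word, plus a toy "observed" phone sequence
# that stands in for what an acoustic model would prefer.
canonical = ["l", "o", "p", "@", "n"]
observed = ("l", "o", "p", "@")

variants = generate_variants(canonical, RULES)
best = select_best(variants, lambda v: -edit_distance(v, observed))
print(best)  # ('l', 'o', 'p', '@')
```

In a real system the `score` argument would be replaced by an acoustic likelihood, for example one obtained by aligning each variant against the recorded speech. The selection step itself would stay the same.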


Bibliographic reference. Demuynck, Kris / Laureys, Tom / Gillis, Steven (2002): "Automatic generation of phonetic transcriptions for large speech corpora", in ICSLP-2002, 333-336.