Interspeech'2005 - Eurospeech
We present a system developed for fully automated transcription of Czech spoken broadcast programs. It includes modules for unsupervised segmentation of the audio stream, speaker and gender recognition followed by speaker adaptation, and our own speech decoder designed for extremely large vocabularies. Compared to our previous results reported in 2004, the new system reduces the word error rate (WER), evaluated on the Czech part of the European COST Broadcast News Database, from 28.5% to 18.4%. This significant improvement is due mainly to a larger lexicon (312K entries) with multiple text and pronunciation variants and multi-word entries, to speaker- and gender-adapted acoustic matching, and to improved language modeling. Besides the results achieved in the Broadcast News task, we also report performance on other similar tasks, such as the transcription of a talk show and of parliamentary speech.
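For readers unfamiliar with the metric, the following is a minimal sketch of how WER is conventionally computed (word-level edit distance divided by the number of reference words) and of how the relative reduction implied by the two figures above can be derived. The function names and the whitespace tokenization are illustrative assumptions, not the evaluation tooling used by the authors.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumes simple whitespace tokenization (an illustrative choice).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_wer: float, new_wer: float) -> float:
    """Fraction of the old error rate that was eliminated."""
    return (old_wer - new_wer) / old_wer


if __name__ == "__main__":
    # WER figures quoted in the abstract: 28.5% before, 18.4% after.
    print(f"{relative_reduction(28.5, 18.4):.1%}")
```

With the figures quoted above, the absolute reduction is 10.1 percentage points, which corresponds to a relative reduction of roughly 35%.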
Bibliographic reference. Nouza, Jan / Zdánský, Jindrich / David, Petr / Cerva, Petr / Kolorenc, Jan / Nejedlová, Dana (2005): "Fully automated system for Czech spoken broadcast transcription with very large (300k+) lexicon", In INTERSPEECH-2005, 1681-1684.