This paper describes the 3.5-year effort put into building large-vocabulary continuous speech recognition (LVCSR) systems for the spontaneous speech of Czech, Russian, and Slovak witnesses of the Holocaust in the MALACH project. To process the colloquial, highly emotional, and heavily accented speech of elderly people, which contains many non-speech events, we have developed techniques that very effectively handle both non-speech events and colloquial and accented variants of uttered words. Manual transcripts, one of the main sources for language modeling, were automatically "normalized" using a standardized lexicon, which reduced the word error rate (WER) by about 2 to 3%. Subsequent interpolation of the resulting language models (LMs) with models built from an additional collection (consisting of topically selected sentences from general text corpora) improved performance by up to a further 3%.
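For readers unfamiliar with the metric, the following is a minimal sketch of how WER is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not the scoring setup used in the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, which is an assumption of this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Calling `word_error_rate(reference, hypothesis)` on a pair of transcripts returns a fraction; multiplying by 100 gives the percentage form in which WER figures such as those above are usually quoted.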
Cite as: Psutka, J., Ircing, P., Psutka, J.V., Hajic, J., Byrne, W.J., Mírovský, J. (2005) Automatic transcription of Czech, Russian, and Slovak spontaneous speech in the MALACH project. Proc. Interspeech 2005, 1349-1352, doi: 10.21437/Interspeech.2005-489
@inproceedings{psutka05_interspeech,
  author={Josef Psutka and Pavel Ircing and J. V. Psutka and Jan Hajic and William J. Byrne and Jirí Mírovský},
  title={{Automatic transcription of Czech, Russian, and Slovak spontaneous speech in the MALACH project}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={1349--1352},
  doi={10.21437/Interspeech.2005-489}
}