Sixth International Conference on Spoken Language Processing
In this paper we investigate methods suitable for practical implementation in a recognition system that classifies telephone input in the form of isolated words or phrases drawn from large vocabularies with equiprobable entries, such as personal names, city names and other local names. Specifically for the Czech language, we propose a pronunciation lexicon with a prefix-stem-suffix arrangement, combined with appropriate caching and pruning techniques and a two-level (monophone and triphone) classification. In experiments with telephone speech containing items from a 5347-word city-name vocabulary, we obtained a recognition rate of 90.1 % with an average time of 645 ms per word. The acoustic models for these experiments were trained on the only available multi-speaker database, which was originally recorded with a microphone and later transmitted over telephone lines and automatically realigned.
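The prefix-stem-suffix arrangement can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `build_lexicon` and `spell`, the toy entries, and the split points are all assumptions made for illustration, and plain ASCII spellings stand in for phoneme sequences. The idea shown is only that each word is stored as three indices into shared tables, so a part common to many words is stored once and its score could in principle be cached and reused.

```python
# Toy prefix-stem-suffix lexicon (illustrative assumption, not the paper's code).
# Each word becomes a triple of indices into three shared tables.

def build_lexicon(entries):
    """entries: iterable of (prefix, stem, suffix) string triples."""
    tables = ({}, {}, {})  # part string -> index, one table per position
    words = []
    for parts in entries:
        # setdefault assigns the next free index only if the part is new
        ids = tuple(table.setdefault(part, len(table))
                    for table, part in zip(tables, parts))
        words.append(ids)
    return tables, words


def spell(tables, ids):
    """Rebuild the full string of a word from its three part indices."""
    inverse = [{i: s for s, i in t.items()} for t in tables]
    return "".join(inv[i] for inv, i in zip(inverse, ids))


# Hypothetical entries; the split points are arbitrary, chosen to show sharing.
entries = [
    ("pod", "mokl", "y"),
    ("pod", "ol", "i"),
    ("za", "hor", "i"),
    ("", "prah", "a"),
    ("", "ostrav", "a"),
]

tables, words = build_lexicon(entries)
```

In this toy list, five words share three distinct prefixes (including the empty one) and three distinct suffixes, so any per-part computation would be done three times per position rather than five. How much is actually saved on a real vocabulary depends on how the parts are chosen, which the abstract does not specify.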
Bibliographic reference. Nouza, Jan (2000): "Telephone speech recognition from large lists of Czech words", In ICSLP-2000, vol.4, 394-397.