Modeling Pronunciation Variation for Automatic Speech Recognition
Rolduc, The Netherlands
This paper addresses the problem of generating lexical word representations that capture natural pronunciation variation, with the aim of improving speech recognition accuracy. The work builds on a procedure for data-driven optimisation of the pronunciation dictionary that creates a single baseform per vocabulary word, subject to a maximum likelihood (ML) criterion [1]. Here we extend the ML formulation to model pronunciation variation optimally. Since different words do not, in general, exhibit the same amount of pronunciation variation, the procedure allows each word to be represented by a different number of baseforms. The method improves the sub-word description of the vocabulary words and has been shown to improve recognition performance on the DARPA Resource Management (RM) task.
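To make the idea of a per-word, variable number of baseforms concrete, the following is a minimal sketch and not the authors' procedure. It assumes that per-utterance log-likelihoods for a set of candidate baseforms are already available, scores each utterance by its best-matching chosen baseform, and adds baseforms greedily while the total log-likelihood gain stays above a threshold. The function name `select_baseforms`, the parameters `max_forms` and `min_gain`, and the greedy stopping rule are all illustrative assumptions.

```python
from typing import Dict, List


def select_baseforms(
    loglik: Dict[str, List[float]],
    max_forms: int = 3,
    min_gain: float = 1.0,
) -> List[str]:
    """Pick baseforms for one word from candidate scores.

    loglik[b][i] is an assumed log-likelihood of training utterance i
    given candidate baseform b. All names here are hypothetical.
    """
    # Single ML baseform: the candidate with the highest total log-likelihood.
    first = max(loglik, key=lambda b: sum(loglik[b]))
    chosen = [first]
    # Best score per utterance under the baseforms chosen so far.
    best = list(loglik[first])

    while len(chosen) < max_forms:
        remaining = [b for b in loglik if b not in chosen]
        if not remaining:
            break

        def gain(b: str) -> float:
            # Total improvement if each utterance may also use baseform b.
            return sum(max(s, t) for s, t in zip(best, loglik[b])) - sum(best)

        candidate = max(remaining, key=gain)
        if gain(candidate) < min_gain:
            # Assumed stopping rule: an extra baseform must pay for itself.
            break
        chosen.append(candidate)
        best = [max(s, t) for s, t in zip(best, loglik[candidate])]

    return chosen
```

Under these assumptions, a word whose utterances are all explained well by one candidate keeps a single baseform, while a word with utterances that split between two pronunciations receives two.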
Bibliographic reference. Holter, Trym / Svendsen, Torbjorn (1998): "Maximum likelihood modelling of pronunciation variation", In MPV-1998, 63-66.