Modeling Pronunciation Variation for Automatic Speech Recognition
Rolduc, The Netherlands
While most research into pronunciation variation focuses on the phonology of the problem, this paper examines the problem in the acoustic domain. The motivation for this is simple: phonetic boundaries imply an artificial quantization of a fundamentally continuous process. The acoustic feature space to be considered is that of the cepstrum-based features used in a state-of-the-art speech recognition system. As a result, pronunciation variation will be investigated from the "perspective" of an automatic speech recognizer. In order to better visualize the acoustic effects of pronunciation variation, a tool has been constructed to project the acoustic features onto a suitable viewing plane. Acoustic models are also projected onto the same plane in order to appreciate the mismatch between typical acoustic data and carefully trained continuous-density Gaussian mixture observation pdfs.
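The projection idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the viewing plane is chosen or how the models are projected, so everything below is an assumption made for illustration: the plane is taken to pass through three anchor points (for example, three model means), each Gaussian is assumed to have a diagonal covariance, and all function names are hypothetical rather than taken from the authors' tool.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    """Element-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def plane_from_anchors(p0, p1, p2):
    """Build (origin, b1, b2): an orthonormal basis of the plane through
    three anchor points, using Gram-Schmidt. Assumed plane choice."""
    u = sub(p1, p0)
    v = sub(p2, p0)
    n1 = math.sqrt(dot(u, u))
    if n1 < 1e-12:
        raise ValueError("p0 and p1 coincide")
    b1 = [x / n1 for x in u]
    # Remove the component of v along b1, then normalize the remainder.
    along = dot(v, b1)
    w = [x - along * y for x, y in zip(v, b1)]
    n2 = math.sqrt(dot(w, w))
    if n2 < 1e-12:
        raise ValueError("anchor points are collinear")
    b2 = [x / n2 for x in w]
    return p0, b1, b2


def project_point(x, plane):
    """Orthogonally project one feature vector onto the plane (2-D coords)."""
    origin, b1, b2 = plane
    d = sub(x, origin)
    return (dot(d, b1), dot(d, b2))


def project_trajectory(frames, plane):
    """Project a sequence of feature frames, giving a 2-D trajectory."""
    return [project_point(f, plane) for f in frames]


def project_diag_gaussian(mean, variances, plane):
    """Project a diagonal-covariance Gaussian: the mean maps like a point,
    and the 2x2 covariance is B diag(variances) B^T."""
    _, b1, b2 = plane

    def c(a, b):
        return sum(ai * v * bi for ai, v, bi in zip(a, variances, b))

    cov = [[c(b1, b1), c(b1, b2)], [c(b1, b2), c(b2, b2)]]
    return project_point(mean, plane), cov


# Toy 3-dimensional example (real cepstral vectors have more dimensions).
plane = plane_from_anchors([0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 3.0, 0.0])
trajectory = project_trajectory([[1.0, 2.0, 5.0], [1.5, 2.5, 4.0]], plane)
mean2d, cov2d = project_diag_gaussian([1.0, 1.0, 1.0], [0.5, 2.0, 9.0], plane)
```

Because the same basis is applied to both the frames and the model parameters, the projected trajectory and the projected Gaussian contours can be drawn on one plot and compared directly. Variation orthogonal to the plane is discarded, so any mismatch seen in the view is only a partial picture of the mismatch in the full feature space.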
Bibliographic reference. Peters, S. Douglas / Stubley, Peter (1998): "Visualizing speech trajectories", In MPV-1998, 97-102.