Modeling Pronunciation Variation for Automatic Speech Recognition

Rolduc, The Netherlands
May 4-6, 1998

Visualizing Speech Trajectories

S. Douglas Peters, Peter Stubley

Nortel Technology, Montreal, Quebec, Canada

While most research into pronunciation variation focuses on the phonology of the problem, this paper examines the problem in the acoustic domain. The motivation is simple: phonetic boundaries impose an artificial quantization on a fundamentally continuous process. The acoustic feature space considered is that of the cepstrum-based features used in a state-of-the-art speech recognition system, so pronunciation variation is investigated from the "perspective" of an automatic speech recognizer. To better visualize the acoustic effects of pronunciation variation, a tool has been constructed that projects the acoustic features onto a suitable viewing plane. Acoustic models are projected onto the same plane so that the mismatch between typical acoustic data and carefully trained continuous-density Gaussian mixture observation pdfs can be seen directly.
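
The abstract does not say how the viewing plane is chosen or how the models are drawn on it, so the following is only an illustrative sketch under stated assumptions, not the authors' tool. It assumes the plane is defined by three anchor points in feature space (for example, three model means), that each mixture component has a diagonal covariance, and that a linear projection is used. All function names and the toy data are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def normalize(v):
    n = math.sqrt(dot(v, v))
    if n == 0.0:
        raise ValueError("degenerate direction: anchor points do not span a plane")
    return [x / n for x in v]

def plane_through(p0, p1, p2):
    """Return (origin, u, v): an orthonormal basis of the plane through
    three anchor points, with p0 as the origin (assumed construction)."""
    u = normalize(sub(p1, p0))
    w = sub(p2, p0)
    c = dot(w, u)
    # Gram-Schmidt step: remove the component of w that lies along u.
    v = normalize([wi - c * ui for wi, ui in zip(w, u)])
    return p0, u, v

def project_point(x, origin, u, v):
    """Project one feature vector onto the viewing plane."""
    d = sub(x, origin)
    return (dot(d, u), dot(d, v))

def project_diag_gaussian(mean, variances, origin, u, v):
    """Project N(mean, diag(variances)) onto the plane.
    A linear map B sends the covariance S to B^T S B."""
    m2 = project_point(mean, origin, u, v)
    suu = sum(s * a * a for s, a in zip(variances, u))
    suv = sum(s * a * b for s, a, b in zip(variances, u, v))
    svv = sum(s * b * b for s, b in zip(variances, v))
    return m2, [[suu, suv], [suv, svv]]

# Toy 3-dimensional stand-in for cepstral frames (not real speech data).
origin, u, v = plane_through([0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 3.0, 0.0])
trajectory = [[0.0, 0.0, 1.0], [1.0, 2.0, 5.0], [2.0, 1.0, -3.0]]
trajectory_2d = [project_point(f, origin, u, v) for f in trajectory]
mean_2d, cov_2d = project_diag_gaussian([1.0, 1.0, 1.0], [4.0, 9.0, 16.0], origin, u, v)
```

Plotting `trajectory_2d` as a connected path and drawing an ellipse from `mean_2d` and `cov_2d` would place the data and one model component on the same plane, which is the kind of comparison the abstract describes. Any component of a frame that is orthogonal to the plane is discarded by the projection, so the choice of anchor points determines which mismatches are visible.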


Bibliographic reference. Peters, S. Douglas / Stubley, Peter (1998): "Visualizing speech trajectories", in MPV-1998, 97-102.