Third International Conference on Spoken Language Processing (ICSLP 94)
One of the sources of difficulty in speech recognition and understanding is the variability due to alternate pronunciations of words. To address this issue, we have investigated the use of multiple-pronunciation models (MPMs) in the decoding stage of a speaker-independent speech understanding system. In this paper we address three important issues regarding MPMs: (a) Model construction: How can MPMs be built from available data without human intervention? (b) Model embedding: How should MPM construction interact with the training of the sub-word unit models on which they are based? (c) Utility: Do they help in speech recognition? Automatic, data-driven MPM construction is accomplished using a structural HMM induction algorithm. The resulting MPMs are trained jointly with a multi-layer perceptron functioning as a phonetic likelihood estimator. The experiments reported here demonstrate that MPMs can significantly improve speech recognition results over standard single-pronunciation models.
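To illustrate the basic idea of a multiple-pronunciation lexicon, the sketch below builds, for a single word, a set of pronunciation variants weighted by their relative frequencies in observed phone transcriptions. This is a deliberately simplified stand-in, not the paper's structural HMM induction algorithm: the function name, the example word, and the phone sequences are all hypothetical, and a real MPM would merge shared sub-sequences into a graph rather than keep whole variants.

```python
from collections import defaultdict

def build_pronunciation_variants(pronunciations):
    """Estimate a weighted multiple-pronunciation lexicon entry for one word.

    Each observed pronunciation is a list of phone symbols; the returned
    dictionary maps each distinct variant to its relative frequency,
    which a decoder could use as a prior over pronunciations.
    (Simplified illustration only -- not the HMM induction of the paper.)
    """
    counts = defaultdict(int)
    for phones in pronunciations:
        counts[tuple(phones)] += 1
    total = sum(counts.values())
    return {variant: n / total for variant, n in counts.items()}

# Hypothetical phone transcriptions of the word "tomato"
observed = [
    ["t", "ah", "m", "ey", "t", "ow"],
    ["t", "ah", "m", "ey", "t", "ow"],
    ["t", "ah", "m", "aa", "t", "ow"],
]
model = build_pronunciation_variants(observed)
# model now holds two variants with weights 2/3 and 1/3
```

At decode time, such weights let the recognizer score competing pronunciations of the same word instead of forcing a single canonical form, which is the motivation behind MPMs.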
Bibliographic reference. Wooters, Chuck / Stolcke, Andreas (1994): "Multiple-pronunciation lexical modeling in a speaker independent speech understanding system", In ICSLP-1994, 1363-1366.