Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Modelling Syllable Characteristics to Improve a Large Vocabulary Continuous Speech Recogniser

M. Jones, Phil C. Woodland

Cambridge University Engineering Department, Cambridge, UK

The acoustic-phonetic modelling used in state-of-the-art large vocabulary continuous speech recognisers (LVCSR) cannot effectively exploit the prosody-based distinctions known to exist at the syllable level. These distinctions concern the strength of the syllable (strong or weak) and the stress it is given (stressed or unstressed). This paper shows how a small set of syllable-sized Hidden Markov Models (HMMs) can model syllable type effectively. Applied to a large vocabulary continuous speech recogniser, these models achieved a 23% reduction in word error rate.


Bibliographic reference. Jones, M. / Woodland, Phil C. (1994): "Modelling syllable characteristics to improve a large vocabulary continuous speech recogniser", in ICSLP-1994, 2171-2174.