Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Continuous Speech Recognition Using Non-Uniform Unit Based Acoustic and Language Models

Shoichi Matsunaga, Takeshi Matsumura, Harald Singer

ATR Interpreting Telecommunications Research Laboratories, Soraku-gun, Kyoto, Japan

This paper proposes a continuous speech recognition strategy that combines non-uniform unit based acoustic hidden Markov models with stochastic language models. The non-uniform units consist of phoneme units and long-units, each of which covers a sequence of contiguous phonemes. The long-unit is introduced to cope with the more fluent acoustic characteristics that frequently occur in the target speech. The same units are also used in a non-uniform unit n-gram language model, which makes the integration of acoustic and linguistic processing transparent. In phrase recognition experiments, the non-uniform units achieved an error reduction rate of 11% compared with conventional context-dependent phone models; adding a non-uniform unit trigram raised the error reduction rate to 27%, showing the effectiveness of this strategy.
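
To make the idea of a shared unit inventory concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how long-units are selected or how an utterance is segmented, so the long-unit inventory `LONG_UNITS`, the greedy longest-match rule in `segment`, and the boundary symbols `<s>` and `</s>` are all assumptions made for illustration. The sketch rewrites a phoneme sequence as a sequence of non-uniform units and counts trigrams over those units, so that the language model sees the same unit set an acoustic model would use.

```python
from collections import Counter

# Hypothetical long-unit inventory (an assumption for illustration):
# each long-unit is a tuple of contiguous phonemes.
LONG_UNITS = {("d", "e", "s", "u"), ("m", "a", "s", "u")}
MAX_LEN = max(len(u) for u in LONG_UNITS)


def segment(phonemes, long_units=LONG_UNITS, max_len=MAX_LEN):
    """Split a phoneme list into non-uniform units by greedy longest match.

    Greedy matching is an assumed rule; the abstract does not specify one.
    """
    units, i = [], 0
    while i < len(phonemes):
        # Try the longest candidate first, down to length 2.
        for n in range(min(max_len, len(phonemes) - i), 1, -1):
            cand = tuple(phonemes[i:i + n])
            if cand in long_units:
                units.append(cand)
                i += n
                break
        else:
            # No long-unit starts here: fall back to a single phoneme unit.
            units.append((phonemes[i],))
            i += 1
    return units


def trigram_counts(unit_sequences):
    """Count trigrams over non-uniform units, padding with boundary symbols."""
    counts = Counter()
    for units in unit_sequences:
        padded = [("<s>",), ("<s>",)] + units + [("</s>",)]
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            counts[(a, b, c)] += 1
    return counts


if __name__ == "__main__":
    units = segment(list("kimasu"))
    print(units)
    print(trigram_counts([units]))
```

Turning these counts into smoothed trigram probabilities, and choosing which phoneme sequences deserve a long-unit, are left out because the abstract gives no detail on either step.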

Bibliographic reference. Matsunaga, Shoichi / Matsumura, Takeshi / Singer, Harald (1995): "Continuous speech recognition using non-uniform unit based acoustic and language models", in EUROSPEECH-1995, 1619-1622.