This paper proposes a continuous speech recognition strategy that combines hidden Markov models (HMMs) built on acoustic non-uniform units with stochastic language models. The non-uniform units comprise phoneme units and long-units, each of which covers a sequence of contiguous phonemes. Long-units are introduced to cope with the more fluent acoustic characteristics that occur frequently in the target speech, and the same units also serve as the basis of a non-uniform unit n-gram model, which keeps the integration of acoustic and linguistic processing transparent. In phrase recognition experiments, the non-uniform units achieved an error reduction rate of 11% compared with conventional context-dependent phone models, and adding a non-uniform unit trigram raised the error reduction rate to 27%, demonstrating the effectiveness of the strategy.
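To make the idea of a shared unit inventory concrete, the following is a minimal illustrative sketch, not the authors' procedure. The abstract does not say how long-units are selected or how a phoneme string is segmented, so the inventory `LONG_UNITS`, the greedy longest-match rule in `segment`, and the padding symbols in `trigram_counts` are all assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical long-unit inventory (assumption): each entry is a tuple of
# contiguous phonemes that is treated as a single recognition unit.
LONG_UNITS = {("a", "r", "i"), ("m", "a", "s")}
MAX_LEN = max(len(u) for u in LONG_UNITS)


def segment(phonemes):
    """Split a phoneme list into non-uniform units.

    Greedy longest match (an assumed rule): take the longest long-unit that
    starts at the current position, otherwise fall back to a phoneme unit.
    """
    units, i = [], 0
    while i < len(phonemes):
        for n in range(min(MAX_LEN, len(phonemes) - i), 1, -1):
            cand = tuple(phonemes[i:i + n])
            if cand in LONG_UNITS:
                units.append(cand)
                i += n
                break
        else:
            # No long-unit matched here: emit a single-phoneme unit.
            units.append((phonemes[i],))
            i += 1
    return units


def trigram_counts(unit_seqs):
    """Count trigrams over unit sequences, so the language model is defined
    on the same units as the acoustic models."""
    counts = Counter()
    for seq in unit_seqs:
        padded = [("<s>",), ("<s>",)] + seq + [("</s>",)]
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            counts[(a, b, c)] += 1
    return counts


units = segment(["a", "r", "i", "g", "a", "t", "o"])
counts = trigram_counts([units])
```

Here the three phonemes `a r i` are grouped into one long-unit and the remaining phonemes stay as phoneme units; the trigram counts are then collected over that mixed unit sequence rather than over words or phonemes alone.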
Bibliographic reference. Matsunaga, Shoichi / Matsumura, Takeshi / Singer, Harald (1995): "Continuous speech recognition using non-uniform unit based acoustic and language models", In EUROSPEECH-1995, 1619-1622.