Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
This paper describes "ATREUS/SSS-LR", a system for accurate continuous speech recognition. A phoneme-context-dependent LR parser drives allophonic HMMs represented by a shared-state network (HMnet) that is automatically generated by the Successive State Splitting (SSS) algorithm. The SSS principle is also applied to duration clustering: optimal clusters of phoneme-context-dependent durations are generated automatically, independently of the HMnet-based allophonic classes.
ATREUS/SSS-LR achieved a phrase recognition rate of 93.2%, the best result obtained in the 1000-word recognition experiments conducted at ATR. This rate was obtained with a smaller beam width than that used with discrete HMMs (fuzzy vector quantization) and continuous mixture density HMMs, which shows that SSS-LR can achieve both fast parsing and high accuracy.
Bibliographic reference. Nagai, Akito / Takami, Jun-Ichi / Sagayama, Shigeki (1992): "The SSS-LR continuous speech recognition system: integrating SSS-derived allophone models and a phoneme-context-dependent LR parser", In ICSLP-1992, 1511-1514.