Sixth International Conference on Spoken Language Processing (ICSLP 2000)

Beijing, China
October 16-20, 2000

Unifying HMM and Phone-Pair Segment Models

Hsiao-Wuen Hon, Shankar Kumar, Kuansan Wang

Speech Technology Group, Microsoft Research, Redmond, WA, USA

It is well known that the HMM is ineffective in modeling the dynamics of speech because of its piecewise-stationarity and independent-observation assumptions. In this paper, we propose an analytically tractable framework in which the HMM and the segment model are combined to reach a jointly optimal decision in both training and recognition. The combination is achieved by coupling the hidden processes of the HMM and the segment model. To take full advantage of the segmental approach, phone-pair units are used as the basic acoustic units for the segment models. In addition, we construct context-dependent phone-pair units to account for acoustic variations in context. The superior quality of phone-pair segment models contributes to an 8.2% reduction in error rates on the WSJ dictation task.

Bibliographic reference.  Hon, Hsiao-Wuen / Kumar, Shankar / Wang, Kuansan (2000): "Unifying HMM and phone-pair segment models", In ICSLP-2000, vol.1, 286-289.