8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Integration of Articulatory Dynamic Parameters in HMM/BN based Speech Recognition System

Konstantin Markov, Satoshi Nakamura, Jianwu Dang

Advanced Telecommunication Research Institute International, Japan

In this paper, we describe several approaches to integrating articulatory dynamic parameters, along with articulatory position data, into an HMM/BN model based automatic speech recognition system. This work continues our previous study, in which we successfully combined speech acoustic features in the form of MFCC with articulatory position observations. The articulatory dynamic parameters are represented by velocity and acceleration coefficients. All of these features are integrated using the HMM/BN acoustic model, where each feature corresponds to a different Bayesian Network variable. By changing the BN topology, we can change the way the articulatory and acoustic parameters are combined. The evaluation experiments showed that the effect of the articulatory dynamic features depends greatly on the BN structure, and that careful data analysis is essential for gaining knowledge about the data dependencies. Compared with a conventional HMM system trained on acoustic data only, the HMM/BN system achieved a significant improvement in recognition performance.
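
As an illustration of what velocity and acceleration coefficients can look like in practice, the sketch below derives them from a sequence of articulatory position frames. The abstract does not state how the coefficients were computed; this example assumes the regression-based delta formula commonly used in speech front ends, with edge frames repeated for padding. The function names (delta, stack_dynamic) and the window size are hypothetical and chosen only for this example.

```python
import numpy as np

def delta(features, window=2):
    """Regression-based delta coefficients along the frame axis (rows).

    Assumption: d_t = sum_k k * (c_{t+k} - c_{t-k}) / (2 * sum_k k^2),
    with edge frames repeated so every frame has a full window.
    """
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    out = np.zeros_like(features)
    for k in range(1, window + 1):
        ahead = padded[window + k : window + k + num_frames]
        behind = padded[window - k : window - k + num_frames]
        out += k * (ahead - behind)
    return out / denom

def stack_dynamic(positions, window=2):
    """Return [position, velocity, acceleration] stacked per frame."""
    positions = np.asarray(positions, dtype=float)
    velocity = delta(positions, window)      # first-order dynamics
    acceleration = delta(velocity, window)   # second-order dynamics
    return np.hstack([positions, velocity, acceleration])

# Hypothetical data: 10 frames, 2 articulatory channels moving linearly.
frames = np.arange(10, dtype=float)[:, None] * np.array([1.0, 2.0])
observations = stack_dynamic(frames)
```

For a trajectory that moves linearly in time, the velocity of the interior frames equals the slope of each channel and the acceleration is zero, which gives a quick sanity check on the formula. In an HMM/BN setup such as the one described above, the position, velocity and acceleration columns could then be assigned to separate Bayesian Network variables rather than used as one stacked vector.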


Bibliographic reference. Markov, Konstantin / Nakamura, Satoshi / Dang, Jianwu (2004): "Integration of articulatory dynamic parameters in HMM/BN based speech recognition system", in INTERSPEECH-2004, 561-564.