INTERSPEECH 2008
9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Integration of Metamodel and Acoustic Model for Speech Recognition

Hironori Matsumasa (1), Tetsuya Takiguchi (1), Yasuo Ariki (1), Ichao Li (2), Toshitaka Nakabayashi (1)

(1) Kobe University, Japan; (2) Otemon Gakuin University, Japan

We investigated speech recognition for a person with articulation disorders resulting from athetoid cerebral palsy. The articulation of the first utterance tends to become unstable due to strain on the speech-related muscles, which degrades recognition performance. To address this, we previously proposed a robust feature extraction method based on PCA (Principal Component Analysis) instead of MFCC [1]. In this paper, we discuss our effort to integrate a metamodel [2] with the acoustic model. The metamodel is a technique for incorporating a model of a speaker's confusion matrix into the ASR process in such a way as to increase recognition accuracy. The effectiveness of the integrated approach has been confirmed by word recognition experiments.
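As a rough illustration of how a speaker's confusion matrix can inform recognition, the sketch below rescores word hypotheses with a hand-written phoneme confusion matrix. It is not the method of this paper or of [2]: the phoneme set, the probabilities, the lexicon, the priors, and the names `CONFUSION`, `LEXICON`, `PRIORS`, `confusion_log_score`, and `decode` are all assumptions made for the example, and only substitutions are modelled (insertions and deletions are ignored).

```python
import math

# Hypothetical confusion matrix: CONFUSION[intended][recognized] is an
# illustrative P(recognized phoneme | intended phoneme). Values are invented.
CONFUSION = {
    "k": {"k": 0.50, "t": 0.40, "a": 0.05, "o": 0.05},
    "t": {"k": 0.10, "t": 0.80, "a": 0.05, "o": 0.05},
    "a": {"k": 0.02, "t": 0.03, "a": 0.85, "o": 0.10},
    "o": {"k": 0.02, "t": 0.03, "a": 0.15, "o": 0.80},
}

# Hypothetical lexicon (word -> intended phoneme sequence) and word priors.
LEXICON = {
    "kata": ["k", "a", "t", "a"],
    "tata": ["t", "a", "t", "a"],
    "tako": ["t", "a", "k", "o"],
}
PRIORS = {"kata": 0.6, "tata": 0.2, "tako": 0.2}


def confusion_log_score(recognized, intended, confusion, floor=1e-6):
    """Log-probability of the recognized phonemes given the intended ones.

    Assumes a one-to-one alignment (substitutions only); sequences of
    different length get a score of negative infinity.
    """
    if len(recognized) != len(intended):
        return float("-inf")
    return sum(
        math.log(max(confusion[p].get(r, 0.0), floor))
        for p, r in zip(intended, recognized)
    )


def decode(recognized, lexicon, confusion, priors):
    """Return the word that maximizes log prior + confusion log-score."""
    return max(
        lexicon,
        key=lambda w: math.log(priors[w])
        + confusion_log_score(recognized, lexicon[w], confusion),
    )


if __name__ == "__main__":
    # The recognizer output matches "tata" exactly, but this speaker often
    # produces "t" for an intended "k", so the prior can tip the decision.
    print(decode(["t", "a", "t", "a"], LEXICON, CONFUSION, PRIORS))
```

In this toy setting the decoder weighs how often the speaker's intended phonemes are recognized as something else, rather than trusting the recognized phoneme string as it stands. Handling insertions and deletions would need an alignment step that this sketch omits.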

References

  1. H. Matsumasa, T. Takiguchi, Y. Ariki, I. Li, and T. Nakabayashi, "PCA-Based Feature Extraction for Fluctuation in Speaking Style of Articulation Disorders," INTERSPEECH-2007, pp. 1150-1153, 2007. (ISCA Archive, http://www.isca-speech.org/archive/interspeech_2007)
  2. O. C. Morales and S. Cox, "Modelling Confusion Matrices to Improve Speech Recognition Accuracy, with an Application to Dysarthric Speech," INTERSPEECH-2007, pp. 1565-1568, 2007. (ISCA Archive, http://www.isca-speech.org/archive/interspeech_2007)

Bibliographic reference.  Matsumasa, Hironori / Takiguchi, Tetsuya / Ariki, Yasuo / Li, Ichao / Nakabayashi, Toshitaka (2008): "Integration of metamodel and acoustic model for speech recognition", In INTERSPEECH-2008, 2234-2237.