Sixth European Conference on Speech Communication and Technology
(EUROSPEECH'99)

Budapest, Hungary
September 5-9, 1999

HMM Composition of Segmental Unit Input HMM for Noisy Speech Recognition

Kazumasa Yamamoto, Seiichi Nakagawa

Toyohashi University of Technology, Department of Information and Computer Sciences, Tenpaku-cho, Toyohashi, Japan

Various methods have been studied for robust speech recognition in noisy environments. In this paper, we apply parallel model combination (PMC) to the segmental unit input HMM in order to recognize speech corrupted by additive noise. In segmental unit input modeling, several successive frames are combined and treated as a single input vector, and the resulting increase in vector dimension degrades the precision with which covariance matrices can be estimated. The Karhunen-Loeve expansion or LDA is therefore used to reduce the dimension. As a consequence, the segmental statistics must be inverse-transformed to the cepstral domain, and correlations between frames have to be taken into account. We extended the original PMC to the segmental unit input HMM. Experimental results showed that the proposed PMC for the segmental unit input HMM gives better recognition performance than the original PMC.
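
As an illustration of the model-combination step that PMC relies on, the following is a minimal sketch of the widely used log-normal approximation for combining a clean-speech Gaussian with a noise Gaussian. It is not the segmental extension proposed in the paper. It assumes the statistics are already in the log-spectral domain (that is, after any inverse transformation from the cepstral or reduced-dimension domain) and that covariances are diagonal, so correlations between channels or frames are ignored. The function names and the `gain` parameter are hypothetical and chosen only for this example.

```python
import math


def log_to_linear(mu_log, var_log):
    """Map per-channel log-domain Gaussian statistics to linear-domain
    mean and variance under the log-normal assumption."""
    mu_lin = [math.exp(m + v / 2.0) for m, v in zip(mu_log, var_log)]
    var_lin = [ml * ml * (math.exp(v) - 1.0) for ml, v in zip(mu_lin, var_log)]
    return mu_lin, var_lin


def linear_to_log(mu_lin, var_lin):
    """Inverse of log_to_linear: linear-domain statistics back to the log domain."""
    var_log = [math.log(v / (m * m) + 1.0) for m, v in zip(mu_lin, var_lin)]
    mu_log = [math.log(m) - 0.5 * vl for m, vl in zip(mu_lin, var_log)]
    return mu_log, var_log


def pmc_combine(speech, noise, gain=1.0):
    """Combine (mean, variance) lists for speech and additive noise.

    `speech` and `noise` are (mu_log, var_log) tuples; `gain` is an
    assumed level-matching factor applied to the speech model.
    """
    s_mu, s_var = log_to_linear(*speech)
    n_mu, n_var = log_to_linear(*noise)
    # Additive noise: means and variances add in the linear spectral domain.
    mu = [gain * s + n for s, n in zip(s_mu, n_mu)]
    var = [gain * gain * s + n for s, n in zip(s_var, n_var)]
    return linear_to_log(mu, var)
```

Calling `pmc_combine((speech_mu, speech_var), (noise_mu, noise_var))` returns the log-domain mean and variance of the noisy-speech Gaussian for each channel. A full system would apply this to every state and mixture component and then transform the result back to the feature domain used for recognition.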



Bibliographic reference.  Yamamoto, Kazumasa / Nakagawa, Seiichi (1999): "HMM composition of segmental unit input HMM for noisy speech recognition", In EUROSPEECH'99, 2865-2868.