Third International Conference on Spoken Language Processing (ICSLP 94)
This paper presents a frequency-weighted hidden Markov model (HMM) for noisy speech recognition. In this HMM, the covariance matrices of the Gaussian probability density functions are fixed to the inverses of frequency weighting matrices, both to exploit the robustness of the quefrency-weighted cepstrum and to incorporate relative perceptual importance in the frequency domain into the HMM. Two types of frequency weighting functions and the scaling methods for the frequency weighting matrices are examined using the NOISEX-92 database. In recognition tests on ten digit words, the 0.3 to 0.5th power of the smoothed power spectrum derived from each mean vector, with a normalization factor, was found to give the most robust HMM. Comparative experiments showed that the frequency-weighted HMM attained SNR gains of 12 dB, 6 dB, 3 dB, and 2 dB over a standard diagonal HMM for white, pink, car, and Lynx noises, respectively. Furthermore, duration control was found to be important in the frequency-weighted HMM.
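The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the smoothed log power spectrum is taken as the cosine expansion of a truncated cepstrum with c_0 omitted, the weight at each frequency bin is that spectrum raised to a power `gamma` (0.3 to 0.5 in the abstract), and the normalization factor scales the weights to a mean of one. The function names, `n_bins`, and the default `gamma` are hypothetical. Under these assumptions, the quadratic form in the Gaussian exponent corresponds to a frequency-weighted distance between log spectra.

```python
import math


def log_spectrum(cepstrum, n_bins=64):
    """Smoothed log power spectrum implied by cepstral coefficients c_1..c_p.

    Assumption: c_0 (overall gain) is omitted, and frequencies are sampled
    uniformly on [0, pi].
    """
    return [
        2.0 * sum(c * math.cos(n * math.pi * k / (n_bins - 1))
                  for n, c in enumerate(cepstrum, start=1))
        for k in range(n_bins)
    ]


def frequency_weights(mean_cepstrum, gamma=0.4, n_bins=64):
    """Weights proportional to the smoothed power spectrum raised to gamma.

    Assumption: the normalization factor rescales the weights to mean one.
    """
    raw = [math.exp(gamma * v) for v in log_spectrum(mean_cepstrum, n_bins)]
    total = sum(raw)
    return [n_bins * r / total for r in raw]


def weighted_distance(x, mean, gamma=0.4, n_bins=64):
    """Frequency-weighted squared distance between two log spectra.

    This plays the role of (x - mean)^T W (x - mean) when the covariance
    is fixed to the inverse of a weighting matrix W built from the mean.
    """
    w = frequency_weights(mean, gamma, n_bins)
    lx = log_spectrum(x, n_bins)
    lm = log_spectrum(mean, n_bins)
    return sum(wk * (a - b) ** 2 for wk, a, b in zip(w, lx, lm)) / n_bins
```

With `gamma=0` every weight equals one and the distance reduces to an unweighted log-spectral distance. Larger values of `gamma` emphasize spectral peaks of the mean, which are typically less affected by additive noise than spectral valleys.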
Bibliographic reference. Matsumoto, Hiroshi / Imose, Hiroyuki (1994): "A frequency-weighted continuous density HMM for noisy speech recognition", In ICSLP-1994, 1007-1010.