Combining speech models with a model of the background noise has been shown to improve the classification rate of noisy speech. The combination can be performed by adding spectral statistics such as the means and variances. Because the speech features used for pattern classification must be expressed in the cepstral domain, the combined spectral statistics have to be transformed into the cepstral domain for speech recognition. In a previous study, we proposed a scheme that adapts the cepstral variance directly, without mapping from the spectral domain to the cepstral domain. This paper proposes an improved version of that adaptation. We observe that the adapted variance can be expressed as a linear interpolation of the speech and noise variances, and that this yields a recognition rate comparable to that obtained with the mapping process. Because the variances are adapted directly, the computation required for environmental adaptation is substantially reduced.
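As a rough illustration of what a linear interpolation of the speech and noise variances looks like, the following minimal sketch combines two per-dimension cepstral variance vectors with a single weight. The function name `interpolate_variance`, the scalar `weight`, and the example values are assumptions made for illustration only; the abstract does not say how the interpolation weight is chosen, so this is not the authors' implementation.

```python
def interpolate_variance(speech_var, noise_var, weight):
    """Linearly interpolate two cepstral variance vectors, dimension by dimension.

    weight is a hypothetical interpolation factor in [0, 1]:
    1.0 returns the clean-speech variance, 0.0 returns the noise variance.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(speech_var) != len(noise_var):
        raise ValueError("variance vectors must have the same length")
    # adapted = weight * speech + (1 - weight) * noise, applied per dimension
    return [weight * s + (1.0 - weight) * n
            for s, n in zip(speech_var, noise_var)]


# Illustrative values only, not taken from the paper.
adapted = interpolate_variance([4.0, 2.0], [1.0, 1.0], weight=0.5)
```

The operation costs one multiply-add per cepstral dimension and needs no transform between the spectral and cepstral domains, which is consistent with the computational saving the abstract describes.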
Cite as: Hwang, T.-H., Yuo, K.-H., Wang, H.-C. (2001) Linear interpolation of cepstral variance for noisy speech recognition. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 877-880, doi: 10.21437/Eurospeech.2001-267
@inproceedings{hwang01_eurospeech,
  author={Tai-Hwei Hwang and Kuo-Hwei Yuo and Hsiao-Chuan Wang},
  title={{Linear interpolation of cepstral variance for noisy speech recognition}},
  year=2001,
  booktitle={Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages={877--880},
  doi={10.21437/Eurospeech.2001-267}
}