Sixth International Conference on Spoken Language Processing
A modified parallel model combination (PMC) method for noisy speech recognition is proposed in which both the speech cepstral mean and the cepstral variance are adapted without mapping the variance between the cepstral and log-spectral domains. By examining how PMC adapts a scalar log-energy random variable, we observe that the adapted log-energy variance can be roughly predicted from the energy ratio of the source signals. Based on this observation, we propose approximating the cepstral variance of the adapted model according to the local signal-to-noise ratio (SNR) of each state. The combined cepstral variance is then set to the variance of clean speech, the variance of noise, or the average of the two. The performance of this approximation is compared with that of the original PMC. Our experiment shows that the performance degradation is small, while the computational cost is greatly reduced compared with the PMC method.
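The following is a minimal Python sketch of the two ideas in the abstract, not the authors' implementation. The first function combines two scalar log-energy Gaussians using the standard log-normal approximation, which is one common PMC variant; the abstract does not say which variant the paper uses. The second function selects the combined cepstral variance from the local SNR of a state. The function names, the thresholds `high_db` and `low_db`, and the assignment of the speech variance to high SNR and the noise variance to low SNR are assumptions made for illustration; the abstract gives neither the thresholds nor the exact rule.

```python
import math
from typing import List, Sequence, Tuple


def combine_log_energy(mu_s: float, var_s: float,
                       mu_n: float, var_n: float) -> Tuple[float, float]:
    """Combine speech and noise log-energy Gaussians (log-normal approximation).

    Assumption: natural-log energies and additive, independent speech and noise.
    """
    # Map each Gaussian in the log domain to a log-normal in the linear domain.
    m_s = math.exp(mu_s + var_s / 2.0)
    m_n = math.exp(mu_n + var_n / 2.0)
    v_s = m_s * m_s * (math.exp(var_s) - 1.0)
    v_n = m_n * m_n * (math.exp(var_n) - 1.0)
    # Energies add in the linear domain; so do the variances if independent.
    m = m_s + m_n
    v = v_s + v_n
    # Map the combined moments back to the log domain.
    var = math.log(v / (m * m) + 1.0)
    mu = math.log(m) - var / 2.0
    return mu, var


def select_combined_variance(speech_var: Sequence[float],
                             noise_var: Sequence[float],
                             local_snr_db: float,
                             high_db: float = 10.0,   # hypothetical threshold
                             low_db: float = -10.0    # hypothetical threshold
                             ) -> List[float]:
    """Pick the cepstral variance of the adapted state from its local SNR."""
    if len(speech_var) != len(noise_var):
        raise ValueError("variance vectors must have the same length")
    if local_snr_db >= high_db:
        # Speech dominates: keep the clean-speech variance (assumed rule).
        return list(speech_var)
    if local_snr_db <= low_db:
        # Noise dominates: take the noise variance (assumed rule).
        return list(noise_var)
    # Comparable energies: average the two variances.
    return [0.5 * (s + n) for s, n in zip(speech_var, noise_var)]
```

With a large energy ratio in either direction, `combine_log_energy` returns a variance close to that of the dominant source. This is the behavior that the selection rule imitates directly in the cepstral domain, avoiding the mapping of variances to the log-spectral domain and back.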
Bibliographic reference. Hwang, Tai-Hwei / Yuo, Kuo-Hwei / Wang, Hsiao-Chuan (2000): "Speech model compensation with direct adaptation of cepstral variance to noisy environment", In ICSLP-2000, vol.4, 366-369.