Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
Speech and noise processes are considered as clusters of points in a multidimensional spectral space. The effect of noise on a signal cluster is to increase its variance and to move its centroid in the general direction of the noise. The resulting degradation in observation likelihood is expressed in terms of the mean and variance of the noise spectrum.
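The cluster picture can be illustrated with a small simulation. This is only a sketch under assumptions not stated in the abstract: a two-band power-spectral space, noise that is additive and independent of the speech, and gamma-distributed band powers with arbitrarily chosen parameters. The helper names `centroid` and `total_variance` are hypothetical.

```python
import random
import statistics

def centroid(points):
    """Per-dimension mean of a list of equal-length vectors."""
    return [statistics.fmean(dim) for dim in zip(*points)]

def total_variance(points):
    """Sum of the per-dimension population variances of a cluster."""
    return sum(statistics.pvariance(dim) for dim in zip(*points))

rng = random.Random(0)
num_frames = 5000

# Assumed (illustrative) band powers; gammavariate(alpha, beta) has
# mean alpha * beta and variance alpha * beta ** 2.
signal = [[rng.gammavariate(100, 0.1), rng.gammavariate(64, 1 / 16)]
          for _ in range(num_frames)]
noise = [[rng.gammavariate(9, 1 / 3), rng.gammavariate(9, 2 / 3)]
         for _ in range(num_frames)]

# Additive-noise assumption: each noisy frame is signal plus noise, band by band.
noisy = [[s + n for s, n in zip(sv, nv)] for sv, nv in zip(signal, noise)]

# Displacement of the cluster centroid caused by the noise.
shift = [cn - cs for cn, cs in zip(centroid(noisy), centroid(signal))]

print("centroid shift:", shift)
print("noise centroid:", centroid(noise))
print("signal variance:", total_variance(signal))
print("noisy variance:", total_variance(noisy))
```

Under these assumptions the centroid shift equals the noise centroid, and the variance of the noisy cluster is close to the sum of the signal and noise variances.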
In many cases of erroneous speech recognition, the correct model is among the few with the highest scores. This observation forms the basis for re-evaluating the few highest-scoring candidates before making a final decision. The noisy input signal is filtered by state-dependent Wiener filters derived from the most likely state sequence of each HMM. A revised score is then calculated for each candidate model using its filtered signal, and the model with the highest revised score is selected. This method achieves, on average, a 30% reduction in recognition error compared to an uncompensated scheme.
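The re-evaluation step can be sketched roughly as follows. This is not the authors' implementation; it assumes that each HMM state is a diagonal Gaussian over power-spectral bands, that the noise power per band is known, and that the most likely state sequence for each candidate has already been found (for example by Viterbi decoding). The names `wiener_gain`, `log_gauss`, `rescore`, and `pick_best` are hypothetical.

```python
import math

def wiener_gain(speech_power, noise_power):
    """Per-band Wiener gain S / (S + N) from state and noise power spectra."""
    return [s / (s + n) for s, n in zip(speech_power, noise_power)]

def log_gauss(x, mean, var):
    """Log-likelihood of x under a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def rescore(frames, noise_power, state_path, states):
    """Revised score of one candidate model.

    frames:     noisy power spectra, one vector per frame
    state_path: assumed most likely state index for each frame
    states:     state index -> (mean power spectrum, variance)
    """
    total = 0.0
    for frame, state_id in zip(frames, state_path):
        mean, var = states[state_id]
        gain = wiener_gain(mean, noise_power)
        # The gain acts on the spectrum, so power is scaled by gain squared.
        filtered = [g * g * y for g, y in zip(gain, frame)]
        total += log_gauss(filtered, mean, var)
    return total

def pick_best(frames, noise_power, candidates):
    """candidates: model name -> (state_path, states). Returns (best, scores)."""
    scores = {name: rescore(frames, noise_power, path, states)
              for name, (path, states) in candidates.items()}
    return max(scores, key=scores.get), scores
```

Each candidate filters the same noisy input with its own state-dependent gains, so a model whose states match the underlying speech should restore the frames closer to its own means and receive a higher revised score.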
Bibliographic reference. Vaseghi, Saeed V. / Milner, Ben P. (1992): "Speech recognition in noisy environments", In ICSLP-1992, 1487-1490.