ETRW on Speech Processing in Adverse Conditions, Cannes-Mandelieu, France
We address the problem of incorporating frequency weighting into a stochastic modeling framework for robust speech recognition. First, this paper introduces a frequency-weighted Euclidean distance in which the weights are given by a smoothed reference power spectrum. Then, on the basis of this distance measure, a frequency-weighted continuous density HMM is proposed in which the covariances are proportional to the spectral power in the frequency domain. In experiments using group delay spectra or spectral slope (RPS) parameters and their time derivatives, frequency weighting by a global power spectrum significantly improved the recognition accuracy for the RPS from 68.9 % to 91.6 % at a low SNR of 6 dB with added white noise. Furthermore, the frequency-weighted HMM attained a high recognition accuracy of 77.3 % in multi-speaker word recognition at an SNR of 12 dB, a gain of 42.8 % in accuracy over the standard HMM.
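As a rough illustration of the distance measure described in the abstract, the sketch below computes a Euclidean distance between two spectral parameter vectors in which each frequency bin is weighted by a smoothed reference power spectrum. The abstract does not give the exact formulation, so the moving-average smoothing, the normalization of the weights to sum to one, and the names `smooth`, `frequency_weighted_distance`, and `half_width` are all assumptions made for this example, not the paper's definitions.

```python
def smooth(power, half_width=1):
    """Moving-average smoothing of a power spectrum (assumed smoother).

    Bins near the edges are averaged over a shorter window.
    """
    n = len(power)
    smoothed = []
    for k in range(n):
        lo = max(0, k - half_width)
        hi = min(n, k + half_width + 1)
        smoothed.append(sum(power[lo:hi]) / (hi - lo))
    return smoothed


def frequency_weighted_distance(test, ref, ref_power, half_width=1):
    """Squared Euclidean distance with per-bin weights.

    The weights are the smoothed reference power spectrum, normalized
    to sum to one (the normalization is an assumption of this sketch).
    """
    if not (len(test) == len(ref) == len(ref_power)):
        raise ValueError("all inputs must have the same number of bins")
    weights = smooth(ref_power, half_width)
    total = sum(weights)
    if total <= 0:
        raise ValueError("reference power spectrum must have positive energy")
    return sum((w / total) * (t - r) ** 2
               for w, t, r in zip(weights, test, ref))
```

With a flat reference power spectrum the weights are uniform and the measure reduces to the mean squared difference; with power concentrated in a few bins, differences in low-power bins contribute little to the distance.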
Bibliographic reference. Matsumoto, Hiroshi (1992): "A frequency-weighted Euclidean distance and its application to HMM-based recognition of noisy speech", in SPAC-1992, 103-106.