Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

ARDOSS: Autoregressive Domain Spectral Subtraction for Robust Speech Recognition in Additive Noise

Hugo Van Hamme

Lernout & Hauspie Speech Products N.V., Wemmel, Belgium

The first- and second-order statistics of the LPC parameters of speech corrupted by additive noise are predicted from the first few lags of the autocorrelation of the noise. The computed mean allows a correction of the LPC parameters without reference to an assumed state and for any type of HMM emission model. This correction of the mean is equivalent to a 5 dB noise suppression. Additional robustness is gained when the predicted covariance in the AR domain is transposed to the cepstral domain to correct the emission probabilities in a single-Gaussian HMM. These conclusions are drawn from speaker-dependent experiments on the NOISEX-92 database. For a p-th order LPC analysis, correction of the mean is accomplished in O(p^2) floating point operations (flops). The full covariance correction requires O(p^3) flops. An O(p^2) approximation that yields comparable performance in practice is given.
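
The abstract does not spell out the ARDOSS correction itself, so the following is only a minimal sketch of the general idea it builds on, under stated assumptions: for additive noise that is uncorrelated with the speech, the expected autocorrelation of the noisy signal is the sum of the speech and noise autocorrelations, so only lags 0..p of the noise are needed for a p-th order analysis, and the Levinson-Durbin recursion then yields LPC parameters in O(p^2) operations. The function names `levinson_durbin` and `corrected_lpc` are hypothetical and do not come from the paper.

```python
def levinson_durbin(r, p):
    """Solve the order-p LPC normal equations from autocorrelation lags r[0..p].

    Returns (a, err), where a[0] == 1 and the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p. The cost is O(p^2).
    """
    a = [1.0] + [0.0] * p
    err = r[0]
    for i in range(1, p + 1):
        # Correlation of the current predictor with lag i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient of order i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k  # updated prediction-error energy
    return a, err


def corrected_lpc(noisy_r, noise_r, p):
    """Illustrative autocorrelation-domain subtraction (an assumption,
    not the paper's ARDOSS estimator): subtract noise lags 0..p from the
    noisy-speech lags, then run Levinson-Durbin on the result."""
    clean_r = [noisy_r[k] - noise_r[k] for k in range(p + 1)]
    if clean_r[0] <= 0.0:
        raise ValueError("noise energy estimate exceeds signal energy")
    return levinson_durbin(clean_r, p)


# First-order AR speech model (r[k] = 0.5**k) plus white noise of variance 0.2.
noisy_r = [1.2, 0.5, 0.25]
noise_r = [0.2, 0.0, 0.0]
a, err = corrected_lpc(noisy_r, noise_r, 2)
```

In this toy case the subtraction recovers the clean autocorrelation exactly, so `a` comes out close to `[1, -0.5, 0]`. With real, finite-length estimates the subtracted sequence need not remain a valid autocorrelation, which is one reason a statistically grounded correction such as the one the paper proposes is of interest.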


Bibliographic reference.  Hamme, Hugo Van (1994): "ARDOSS: autoregressive domain spectral subtraction for robust speech recognition in additive noise", In ICSLP-1994, 1019-1022.