ISCA Archive ICSLP 1994

ARDOSS: autoregressive domain spectral subtraction for robust speech recognition in additive noise

Hugo Van Hamme

The first- and second-order statistics of the LPC parameters of speech corrupted by additive noise are predicted from the first few lags of the autocorrelation of the noise. The computed mean allows a correction of the LPC parameters without reference to an assumed state and for any type of HMM emission model. This mean correction is equivalent to a 5 dB noise suppression. Additional robustness is gained when the predicted covariance in the AR-domain is transposed to the cepstral domain to correct the emission probabilities in a single-Gaussian HMM. These conclusions are drawn from speaker-dependent experiments on the NOISEX-92 database. For a p-th order LPC analysis, correction of the mean is accomplished in O(p^2) floating point operations (flops). The full covariance correction requires O(p^3) flops. An O(p^2) approximation that yields comparable performance in practice is given.
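
As a rough illustration of the setting (not the ARDOSS correction itself, which the abstract does not spell out), the sketch below runs a p-th order LPC analysis with the Levinson-Durbin recursion, which costs O(p^2) flops, after subtracting the first few noise autocorrelation lags from the noisy-speech autocorrelation. It assumes the noise is additive and uncorrelated with the speech, so that the autocorrelations add. The function names, the `floor` parameter and the example numbers are hypothetical.

```python
import numpy as np

def levinson_durbin(r, p):
    """Solve the order-p LPC normal equations from autocorrelation lags r[0..p].

    Returns (a, err) with a[0] == 1 for the prediction-error filter
    A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p. The cost is O(p^2) flops.
    """
    r = np.asarray(r, dtype=float)
    a = np.zeros(p + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, p + 1):
        # acc = r[i] + sum_{j=1}^{i-1} a[j] * r[i-j]
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient of order i
        prev = a[1:i].copy()
        a[1:i] = prev + k * prev[::-1]  # a[j] += k * a[i-j]
        a[i] = k
        err *= 1.0 - k * k  # updated prediction-error energy
    return a, err

def lpc_with_autocorr_subtraction(r_noisy, r_noise, p, floor=1e-3):
    """Naive autocorrelation-domain noise subtraction followed by LPC.

    Assumption (not from the paper): noise and speech are uncorrelated,
    so r_noisy[k] = r_speech[k] + r_noise[k]. Only the noise lags that
    are supplied are subtracted; the remaining lags are left untouched.
    """
    r_noisy = np.asarray(r_noisy, dtype=float)
    r_noise = np.asarray(r_noise, dtype=float)
    r = r_noisy[:p + 1].copy()
    m = min(len(r_noise), p + 1)
    r[:m] -= r_noise[:m]
    # Keep the zero-lag energy positive. This does not guarantee a
    # positive-definite autocorrelation sequence in general.
    r[0] = max(r[0], floor * r_noisy[0])
    return levinson_durbin(r, p)

# AR(1) speech model with r[k] = 0.5**k, plus white noise of variance 0.2
# (white noise only contributes to lag 0).
a, err = lpc_with_autocorr_subtraction([1.2, 0.5, 0.25], [0.2], p=2)
print(a, err)
```

In this toy case the subtraction recovers the clean autocorrelation exactly, so the LPC coefficients match those of the clean AR(1) model. With estimated rather than exact noise lags, the result is itself a random quantity, which is the kind of variability the predicted mean and covariance in the abstract are meant to account for.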


Cite as: Hamme, H.V. (1994) ARDOSS: autoregressive domain spectral subtraction for robust speech recognition in additive noise. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 1019-1022

@inproceedings{hamme94_icslp,
  author={Hugo Van Hamme},
  title={{ARDOSS: autoregressive domain spectral subtraction for robust speech recognition in additive noise}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={1019--1022}
}