ISCA Archive Interspeech 2013

Regularized MVDR spectrum estimation-based robust feature extractors for speech recognition

Md. Jahangir Alam, Patrick Kenny, Douglas O'Shaughnessy

In this paper, we present two robust feature extractors that estimate the speech power spectrum with a regularized minimum variance distortionless response (RMVDR) spectrum estimator instead of the discrete Fourier transform-based direct spectrum estimator used in many front-ends, including conventional MFCC. Direct spectrum estimators, e.g., the single-tapered periodogram, have high variance and perform poorly under noisy and adverse conditions. The RMVDR spectrum estimator has low spectral variance and is robust to mismatched conditions. Based on the RMVDR spectrum estimator, two robust feature extractors are proposed: robust RMVDR cepstral coefficients (RRMCC) and normalized RMVDR cepstral coefficients (NRMCC), which incorporate an auditory domain spectrum enhancement (ASE) method and a medium duration power bias subtraction (MDPBS) technique, respectively, to enhance the speech spectrum. Speech recognition experiments are conducted on the AURORA-4 corpus, and performance is compared with MFCC, PLP, MVDR-MFCC, RMVDR-MFCC, PMVDR, the ETSI advanced front-end (ETSI-AFE), PNCC, CFCC, and the robust feature extractor (RFE) of [6]. Experimental results demonstrate that the proposed robust feature extractors outperform the other robust front-ends in terms of percentage word accuracy on the AURORA-4 large vocabulary continuous speech recognition (LVCSR) task under different mismatch conditions.
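As a rough illustration of the kind of estimator the abstract refers to, the sketch below computes a textbook MVDR spectrum, P(w) = 1 / (v(w)^H R^{-1} v(w)), from the Toeplitz autocorrelation matrix R of one frame. The abstract does not say how the regularization is done, so the diagonal loading used here is an assumption and may differ from the authors' method. The function names (`autocorr`, `mvdr_spectrum`) and parameters (`order`, `n_freq`, `loading`) are hypothetical and chosen only for this example.

```python
import numpy as np


def autocorr(frame, order):
    """Biased autocorrelation estimate for lags 0..order."""
    n = len(frame)
    return np.array(
        [np.dot(frame[: n - k], frame[k:]) / n for k in range(order + 1)]
    )


def mvdr_spectrum(frame, order=12, n_freq=257, loading=1e-3):
    """MVDR power spectrum on n_freq points in [0, pi].

    `loading` adds loading * r[0] to the diagonal of R. This diagonal
    loading is an ASSUMED form of regularization, not necessarily the
    one used in the paper.
    """
    r = autocorr(np.asarray(frame, dtype=float), order)
    lags = np.arange(order + 1)
    # Symmetric Toeplitz matrix: R[i, j] = r[|i - j|]
    R = r[np.abs(np.subtract.outer(lags, lags))]
    R = R + loading * r[0] * np.eye(order + 1)
    R_inv = np.linalg.inv(R)
    omega = np.linspace(0.0, np.pi, n_freq)
    # Row f holds the steering vector v(w_f) = [1, e^{jw}, ..., e^{jMw}]
    V = np.exp(1j * np.outer(omega, lags))
    # Quadratic form v^H R^{-1} v, evaluated for every frequency at once
    denom = np.real(np.einsum("fi,ij,fj->f", V.conj(), R_inv, V))
    return 1.0 / denom
```

For comparison, a single-tapered periodogram of the same frame could be obtained as `np.abs(np.fft.rfft(frame * np.hamming(len(frame)), 512)) ** 2`. The MVDR estimate is typically a much smoother envelope, and raising `loading` flattens it further.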


doi: 10.21437/Interspeech.2013-262

Cite as: Alam, M.J., Kenny, P., O'Shaughnessy, D. (2013) Regularized MVDR spectrum estimation-based robust feature extractors for speech recognition. Proc. Interspeech 2013, 891-895, doi: 10.21437/Interspeech.2013-262

@inproceedings{alam13b_interspeech,
  author={Md. Jahangir Alam and Patrick Kenny and Douglas O'Shaughnessy},
  title={{Regularized MVDR spectrum estimation-based robust feature extractors for speech recognition}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={891--895},
  doi={10.21437/Interspeech.2013-262}
}