ISCA Archive Interspeech 2007

Optimization of temporal filters in the modulation frequency domain for constructing robust features in speech recognition

Jeih-weih Hung

In this paper, we derive new data-driven temporal filters that employ the statistics of the modulation spectra of the speech features. The new temporal filtering approaches are based on constrained versions of Principal Component Analysis (C-PCA) and Maximum Class Distance (C-MCD), respectively. It is shown that the proposed C-PCA and C-MCD temporal filters can effectively improve speech recognition accuracy in various noise-corrupted environments. In experiments conducted on Test Set A of the Aurora-2 noisy digits database, these new temporal filters, together with cepstral mean and variance normalization (CMVN), provide average relative error reduction rates of over 40% and 27% when compared with the baseline MFCC processing and with CMVN alone, respectively.
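As a rough illustration of the processing chain named in the abstract, the sketch below applies per-utterance CMVN to a feature matrix and then runs an FIR filter along the time axis of each feature dimension. It is not the paper's method: the C-PCA and C-MCD filter coefficients are not given here, so a 5-tap moving average stands in as a placeholder, and the function names `cmvn`, `temporal_filter`, and `relative_error_reduction` are hypothetical. The last helper only shows how a relative error reduction rate is usually computed from two error rates.

```python
import numpy as np


def cmvn(features):
    """Per-utterance cepstral mean and variance normalization.

    features: array of shape (num_frames, num_coeffs).
    """
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    # Guard against division by zero for constant feature trajectories.
    return (features - mean) / np.maximum(std, 1e-8)


def temporal_filter(features, taps):
    """Filter each feature trajectory along the time (frame) axis.

    taps: 1-D FIR coefficients. In the paper these would be data-driven;
    here they are whatever the caller supplies (an assumption).
    """
    columns = [
        np.convolve(features[:, k], taps, mode="same")
        for k in range(features.shape[1])
    ]
    return np.stack(columns, axis=1)


def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline error removed by the new system."""
    return (baseline_error - new_error) / baseline_error


# Usage with synthetic data standing in for MFCC features.
rng = np.random.default_rng(0)
mfcc = rng.normal(loc=3.0, scale=2.0, size=(200, 13))
placeholder_taps = np.ones(5) / 5.0  # moving average, NOT a C-PCA/C-MCD filter
processed = temporal_filter(cmvn(mfcc), placeholder_taps)
```

A low-pass placeholder is used only because it is simple to write down; the filters described in the abstract are instead derived from the statistics of the modulation spectra of the features.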


doi: 10.21437/Interspeech.2007-111

Cite as: Hung, J.-w. (2007) Optimization of temporal filters in the modulation frequency domain for constructing robust features in speech recognition. Proc. Interspeech 2007, 1090-1093, doi: 10.21437/Interspeech.2007-111

@inproceedings{hung07_interspeech,
  author={Jeih-weih Hung},
  title={{Optimization of temporal filters in the modulation frequency domain for constructing robust features in speech recognition}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={1090--1093},
  doi={10.21437/Interspeech.2007-111}
}