EUROSPEECH 2003 - INTERSPEECH 2003
Speech analysis techniques based on autoregressive (all-pole) modeling are presented and compared with the widely used Mel-frequency cepstrum based feature extraction. First, we focus on several applications of all-pole modeling of speech power spectra that improve the performance of an ASR system, mainly when there is a large mismatch between training and test data. Then, attention is paid to the different types of features that can be extracted from the all-pole model to reduce the overall word error rate. The results show that the commonly used cepstrum-based features, which can be easily extracted from the all-pole model, are not the most suitable parameters for ASR when the input speech is corrupted by different types of real noise. Very good recognition performance was achieved, for example, with approaches based on discrete or selective all-pole modeling, or with decorrelated line spectral frequencies. The feature extraction techniques were tested on the SpeechDat-Car databases used for front-end evaluation of advanced distributed speech recognition (DSR) systems.
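As an illustration of how cepstral features can be obtained from an all-pole model of a power spectrum, the following minimal sketch may help. It is not the authors' implementation: it assumes the conventional autocorrelation method (Levinson-Durbin recursion) rather than the discrete or selective all-pole variants mentioned above, and the function names, model order, and FFT size are hypothetical choices made for the example.

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]                               # prediction error energy
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                       # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err

def all_pole_from_power_spectrum(power, order):
    """Fit an all-pole model to a one-sided power spectrum (length N/2 + 1)."""
    r = np.fft.irfft(power)                  # autocorrelation = inverse DFT of power spectrum
    return levinson_durbin(r[:order + 1], order)

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of 1/A(z) via the standard recursion (gain term c[0] omitted)."""
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        c[n] = -(a[n] if n <= p else 0.0) - acc / n
    return c[1:]

# Hypothetical example: power spectrum of a known one-pole system 1 / (1 - 0.5 z^-1).
n_fft = 512
omega = np.linspace(0.0, np.pi, n_fft // 2 + 1)
power = 1.0 / np.abs(1.0 - 0.5 * np.exp(-1j * omega)) ** 2

a, err = all_pole_from_power_spectrum(power, order=2)
ceps = lpc_to_cepstrum(a, n_ceps=3)
print(a)     # close to [1, -0.5, 0]
print(ceps)  # close to [0.5, 0.125, 0.0417], i.e. 0.5**n / n
```

Because the model is fitted to a power spectrum rather than directly to the waveform, the spectrum can in principle be modified (for example warped or restricted to a band) before the autocorrelation is computed; the sketch leaves the spectrum unmodified.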
Bibliographic reference. Motlicek, Petr / Cernocký, Jan (2003): "Autoregressive modeling based feature extraction for Aurora3 DSR task", In EUROSPEECH-2003, 1801-1804.