EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Robust Multiple Resolution Analysis for Automatic Speech Recognition

Roberto Gemello (1), Franco Mana (1), Dario Albesano (1), Renato De Mori (2)

(1) Loquendo, Italy
(2) LIA-CNRS, France

This paper investigates the potential of exploiting the redundancy implicit in Multi Resolution Analysis (MRA) for Automatic Speech Recognition (ASR) systems. Experiments carried out with data collected from home telephones and in cars support the proposed approach for exploiting this redundancy.

Comparisons with Mel Frequency-scaled Cepstral Coefficients (MFCCs) and JRASTA Perceptual Linear Prediction Coefficients (JRASTAPLP) indicate that applying Principal Component Analysis (PCA) to MRA features yields performance superior to MFCCs and competitive with JRASTAPLP features. Experiments in noisy conditions, using the Italian component of the AURORA3 corpus, show a word error rate (WER) reduction of 15.7% when SNR-dependent Spectral Subtraction (SS) is performed on MRA-PCA features compared to when it is performed on JRASTAPLP features. Furthermore, SS appears to perform better than Soft Thresholding (ST).
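
As a rough illustration of how PCA can remove redundancy from a correlated feature set, the sketch below projects synthetic feature vectors onto their leading principal components. It is not the authors' implementation: the function names (pca_fit, pca_transform), the feature dimensions, and the random data standing in for MRA features are all assumptions made for the example.

```python
import numpy as np

def pca_fit(features, n_components):
    """Estimate the mean and the top principal directions.

    features: array of shape (n_frames, n_dims), one feature vector per frame.
    Returns (mean, components), where components has shape (n_dims, n_components).
    """
    mean = features.mean(axis=0)
    centered = features - mean
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]

def pca_transform(features, mean, components):
    """Project features onto the principal directions."""
    return (features - mean) @ components

# Synthetic stand-in for redundant features (hypothetical sizes):
# 12 observed dimensions generated from only 4 underlying sources.
rng = np.random.default_rng(0)
latent = rng.standard_normal((500, 4))
mixing = rng.standard_normal((4, 12))
redundant = latent @ mixing + 0.01 * rng.standard_normal((500, 12))

mean, components = pca_fit(redundant, n_components=4)
reduced = pca_transform(redundant, mean, components)
```

The projected features are decorrelated on the data used to fit the transform, which is one reason PCA is a common choice for compacting redundant representations before acoustic modeling.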

Bibliographic reference.  Gemello, Roberto / Mana, Franco / Albesano, Dario / Mori, Renato De (2003): "Robust multiple resolution analysis for automatic speech recognition", In EUROSPEECH-2003, 3033-3036.