7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Robust Multiple Resolution Analysis for Automatic Speech Recognition

Roberto Gemello (1), Franco Mana (1), Paolo Pegoraro (1), Renato De Mori (2)

(1) LOQUENDO, Italy; (2) LIA CERI-IUP, France

This paper describes the use of time-domain denoising techniques applied to the outputs of the filters of a Multi-Resolution Analysis. Because the energies of the denoised samples are used as features for Automatic Speech Recognition (ASR), soft thresholding is particularly attractive, especially when Principal Component Analysis (PCA) is applied to the whole tree of energy features. This consideration is supported by experimental results on a very large test set in which many speakers utter proper names from different locations on the Italian public telephone network. The results show that soft thresholding outperforms J-Rasta PLP, with a WER reduction of 26% after denoising.
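As an illustration of the soft thresholding mentioned in the abstract, the following minimal sketch applies the standard soft and hard thresholding rules to the samples of one filter output and then computes the band energy. The function names, the sample values, and the threshold of 0.1 are assumptions made for this example only; the abstract does not state how the threshold is chosen or how the filter bank is built.

```python
import math


def soft_threshold(samples, threshold):
    """Standard soft thresholding: shrink each sample toward zero by
    `threshold`; samples whose magnitude is below it become zero."""
    return [math.copysign(max(abs(x) - threshold, 0.0), x) for x in samples]


def hard_threshold(samples, threshold):
    """Standard hard thresholding: zero small samples, keep the rest unchanged."""
    return [x if abs(x) > threshold else 0.0 for x in samples]


def band_energy(samples):
    """Energy of one filter output: the sum of squared samples."""
    return sum(x * x for x in samples)


# Hypothetical samples from a single filter of the analysis tree.
band = [0.05, -0.02, 0.9, -1.2, 0.03, 0.4]
threshold = 0.1  # illustrative value, not taken from the paper

soft = soft_threshold(band, threshold)
hard = hard_threshold(band, threshold)

print("raw energy :", band_energy(band))
print("hard energy:", band_energy(hard))
print("soft energy:", band_energy(soft))
```

Soft thresholding shrinks every surviving sample as well as zeroing the small ones, so the resulting band energy is never larger than the hard-thresholded one. In a full front end, one such energy per filter would form the feature vector to which PCA is applied.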



Bibliographic reference. Gemello, Roberto / Mana, Franco / Pegoraro, Paolo / De Mori, Renato (2002): "Robust multiple resolution analysis for automatic speech recognition", in ICSLP-2002, 2201-2204.