7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Comparison and Combination of RASTA-PLP and FF Features in a Hybrid HMM/MLP Speech Recognition System

Pere Pujol Marsal (1), Susagna Pol Font (1), Astrid Hagen (2), Hervé Bourlard (3), Climent Nadeu (1)

(1) Universitat Politècnica de Catalunya, Spain; (2) INESC, Portugal; (3) Dalle Molle Institute for Perceptual Artificial Intelligence, Switzerland

Recently, the advantages of the spectral parameters obtained by frequency filtering (FF) of the logarithmic filter bank energies (logFBEs) have been reported. These parameters, which are frequency derivatives of the logFBEs, lie in the frequency domain and have shown good recognition performance compared with the conventional mel-frequency cepstral coefficients (MFCCs) in HMM systems. In this paper, the FF features are compared with the MFCCs and the RASTA-PLP features in the framework of a hybrid HMM/MLP recognition system, for both clean and noisy speech.

Taking advantage of the ability of the hybrid system to deal with correlated features, the inclusion of the second frequency derivatives and the raw logFBEs as additional features is proposed. Furthermore, to enhance the robustness of these features in noisy conditions, they are combined with the RASTA temporal filtering approach. Finally, a study of the FF features in the framework of multistream processing is presented. The experimental tests indicate that the new spectral parameters and the tested combinations yield enhanced recognition performance.
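As an illustration of the feature set described above, the following is a minimal sketch and not the authors' implementation. It assumes the frequency derivative is computed with the filter H(z) = z - z^{-1} applied along the band index with zero padding at the band edges, and that the second frequency derivative is obtained by applying the same filter twice; the abstract specifies neither. The function names `frequency_filter` and `stack_features` are hypothetical.

```python
import numpy as np

def frequency_filter(log_fbe):
    """Frequency-filter logFBEs along the band axis.

    Assumption: the filter is H(z) = z - z^{-1}, i.e.
    out[k] = x[k+1] - x[k-1], with zeros beyond the band edges.
    log_fbe has shape (n_frames, n_bands).
    """
    padded = np.pad(log_fbe, ((0, 0), (1, 1)), mode="constant")
    return padded[:, 2:] - padded[:, :-2]

def stack_features(log_fbe):
    """Concatenate first and second frequency derivatives with raw logFBEs.

    Assumption: the second derivative is the filter applied twice.
    """
    first = frequency_filter(log_fbe)
    second = frequency_filter(first)
    return np.concatenate([first, second, log_fbe], axis=1)

# Toy input: one frame with four filter-bank bands.
log_fbe = np.array([[1.0, 2.0, 4.0, 7.0]])
first = frequency_filter(log_fbe)
features = stack_features(log_fbe)
```

The stacked vector is highly correlated across its three parts, which is the property the hybrid HMM/MLP system is said to tolerate.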

