EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

        

A Speech Processing Front-End with Eigenspace Normalization for Robust Speech Recognition in Noisy Automobile Environments

Kaisheng Yao, Erik Visser, Oh-Wook Kwon, Te-Won Lee

University of California at San Diego, USA

A new front-end processing scheme for robust speech recognition is proposed and evaluated on the multilingual Aurora 3 database. The scheme consists of Mel-scaled spectral subtraction, speech segmentation, cepstral coefficient extraction, utterance-level frame dropping, and eigenspace feature normalization. We also investigated performance on all language databases by post-processing features extracted by the ETSI advanced front-end with an additional eigenspace normalization module. This module applies a linear PCA matrix transformation to the features, followed by mean and variance normalization of the transformed cepstral coefficients. In speech recognition experiments, the proposed front-end yielded a relative error rate reduction of more than 16% over the ETSI front-end on the Finnish language database. In addition, augmenting the ETSI front-end with eigenspace normalization gave an average relative error reduction of more than 6% over all languages.
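The eigenspace normalization step described in the abstract (a linear PCA transformation followed by mean and variance normalization) could be sketched roughly as below. This is an illustrative sketch, not the authors' implementation: the function name `eigenspace_normalize`, the `eps` parameter, the choice to estimate the PCA matrix from the same frames that are being normalized, and the synthetic 13-dimensional test data are all assumptions made for the example. The abstract does not say how or on which data the PCA matrix is estimated.

```python
import numpy as np


def eigenspace_normalize(features, eps=1e-8):
    """Rotate cepstral features onto their PCA axes, then apply
    per-dimension mean and variance normalization.

    features: array of shape (num_frames, num_coeffs).
    Hypothetical helper; names and defaults are assumptions.
    """
    features = np.asarray(features, dtype=float)

    # Assumption: PCA statistics are estimated from the frames passed in.
    centered = features - features.mean(axis=0)
    cov = centered.T @ centered / len(features)

    # eigh is meant for symmetric matrices; eigenvalues come back ascending,
    # so reorder the eigenvectors by decreasing variance.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    pca_matrix = eigvecs[:, order]

    # Linear PCA transformation of the feature vectors.
    projected = features @ pca_matrix

    # Mean and variance normalization in the transformed space.
    projected = projected - projected.mean(axis=0)
    return projected / (projected.std(axis=0) + eps)


# Synthetic stand-in for cepstral frames: correlated dimensions, nonzero mean.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(13, 13))
frames = rng.normal(size=(500, 13)) @ mixing + 5.0
normalized = eigenspace_normalize(frames)
```

Under these assumptions, each output dimension has approximately zero mean and unit variance, and the dimensions are decorrelated with respect to the data used to estimate the PCA matrix. In a recognizer, the matrix would more plausibly be estimated once on training data and reused at test time, which this sketch does not show.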

Bibliographic reference. Yao, Kaisheng / Visser, Erik / Kwon, Oh-Wook / Lee, Te-Won (2003): "A speech processing front-end with eigenspace normalization for robust speech recognition in noisy automobile environments", in EUROSPEECH-2003, 9-12.