EUROSPEECH 2003 - INTERSPEECH 2003
A new front-end processing scheme for robust speech recognition is proposed and evaluated on the multi-lingual Aurora 3 database. The scheme consists of Mel-scaled spectral subtraction, speech segmentation, cepstral coefficient extraction, utterance-level frame dropping and eigenspace feature normalization. We also investigated performance on all language databases by post-processing features extracted by the ETSI advanced front-end with an additional eigenspace normalization module. This module applies a linear PCA transformation to the features, followed by mean and variance normalization of the transformed cepstral coefficients. In speech recognition experiments, our proposed front-end yielded a relative error rate reduction of more than 16% over the ETSI front-end on the Finnish language database. In addition, the ETSI front-end augmented by eigenspace normalization gave an average relative error reduction of more than 6% over all languages.
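The eigenspace normalization step described above (a linear PCA transformation followed by mean and variance normalization of the transformed coefficients) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `fit_pca` and `eigenspace_normalize`, the estimation of the PCA basis from pooled training frames, the per-utterance computation of the mean and variance, and the small `eps` guard are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np

def fit_pca(train_feats):
    """Estimate a PCA basis from training features of shape (frames, dims).

    Assumption: the basis is estimated once from pooled training data.
    """
    mean = train_feats.mean(axis=0)
    cov = np.cov(train_feats - mean, rowvar=False)
    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]  # sort by decreasing variance
    return mean, eigvecs[:, order]

def eigenspace_normalize(utt_feats, pca_mean, pca_basis, eps=1e-8):
    """Project cepstral features into the PCA eigenspace, then apply
    mean and variance normalization to each transformed coefficient.

    Assumption: the mean and variance are computed per utterance.
    """
    projected = (utt_feats - pca_mean) @ pca_basis
    mu = projected.mean(axis=0)
    sigma = projected.std(axis=0)
    return (projected - mu) / (sigma + eps)
```

In this sketch, `fit_pca` would be called once on the training features, and `eigenspace_normalize` would then be applied to the cepstral feature matrix of each utterance, for example the output of an existing front-end.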
Bibliographic reference. Yao, Kaisheng / Visser, Erik / Kwon, Oh-Wook / Lee, Te-Won (2003): "A speech processing front-end with eigenspace normalization for robust speech recognition in noisy automobile environments", In EUROSPEECH-2003, 9-12.