11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Using Spectro-Temporal Features to Improve AFE Feature Extraction for ASR

Suman V. Ravuri, Nelson Morgan


Previous work has shown that spectro-temporal features reduce word error rate (WER) for automatic speech recognition (ASR) under noisy conditions. The spectro-temporal framework, however, is not the only way to process features to reduce errors caused by noise in the signal. Another is the two-stage mel-warped Wiener filtering method used in the "Advanced Front End" (AFE), now a standard front end for robust recognition. Since the spectro-temporal approach can be applied to a noise-reduced spectrum, we wanted to explore whether spectro-temporal features could improve the performance of AFE for ASR. We show that computing spectro-temporal features after AFE processing results in a 45% relative improvement over AFE in clean conditions and a 6% to 30% improvement in noisy conditions on the Aurora2 clean training setup.
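
As a reading aid for the "relative improvement" figures above, here is a minimal sketch of how such a figure is commonly computed, assuming the improvement is measured as a relative reduction in WER (the abstract does not state the metric explicitly). The function name and the example numbers are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER.

    Assumption: "relative improvement" means (baseline - new) / baseline.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values in percent, chosen only for illustration.
baseline = 2.0   # e.g. a baseline front end
improved = 1.1   # e.g. the same system with additional features
print(f"{relative_improvement(baseline, improved):.0%}")
```

Under this definition, a drop from 2.0% to 1.1% absolute WER is a 45% relative improvement, even though the absolute change is only 0.9 percentage points.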

Bibliographic reference. Ravuri, Suman V. / Morgan, Nelson (2010): "Using spectro-temporal features to improve AFE feature extraction for ASR", in INTERSPEECH-2010, 1181-1184.