Selecting good features is especially important for achieving high speech recognition accuracy. Although the mel-cepstrum is a popular and effective feature for speech recognition, it remains unclear whether the filter-bank used in the mel-cepstrum is always optimal, regardless of the speech recognition environment or the characteristics of the specific speech data. In this paper, we focus on data-driven filter-bank optimization for a new feature extraction method, in which the Kullback-Leibler (KL) distance serves as the measure in the filter-bank design. Experimental results show that the proposed feature reduces the error rate by about 20% for both clean and noisy speech compared to the conventional mel-cepstral feature.
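For readers unfamiliar with the KL distance mentioned above, the following is a minimal sketch of its generic textbook definition for two discrete distributions. The abstract does not state how the measure is applied within the filter-bank design, so this is not the authors' procedure; the function names `kl_divergence` and `symmetric_kl`, the `eps` floor, and the choice of a symmetrized form are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return D(p || q) for two non-negative sequences of equal length.

    Both inputs are normalized to sum to 1 before the comparison.
    `eps` is an illustrative floor that avoids division by zero when q has
    an empty bin; it is an assumption, not something taken from the paper.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        raise ValueError("p and q must each have positive total mass")
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i /= sum_p
        q_i /= sum_q
        if p_i > 0:  # terms with p_i == 0 contribute nothing
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


def symmetric_kl(p, q):
    """Symmetrized form D(p || q) + D(q || p), often called a KL 'distance'."""
    return kl_divergence(p, q) + kl_divergence(q, p)
```

As a usage illustration, `p` and `q` could be any two histograms defined over the same bins; the measure is zero when the normalized distributions are identical and grows as they diverge.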
Cite as: Suh, Y., Kim, H.-R. (2004) Data-driven filter-bank-based feature extraction for speech recognition. Proc. 9th Conference on Speech and Computer (SPECOM 2004), 154-157
@inproceedings{suh04_specom, author={Youngjoo Suh and Hoi-Rin Kim}, title={{Data-driven filter-bank-based feature extraction for speech recognition}}, year=2004, booktitle={Proc. 9th Conference on Speech and Computer (SPECOM 2004)}, pages={154--157} }