11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Revisiting VTLN Using Linear Transformation on Conventional MFCC

Doddipatla Rama Sanand, Ralf Schlüter, Hermann Ney

RWTH Aachen University, Germany

In this paper, we revisit the linear transformation for VTLN on conventional MFCC proposed by Sanand et al. in [Interspeech 2008], which is based on the idea of band-limited interpolation. Because band-limited interpolation requires the full symmetric spectrum, the filter-bank is modified to include half-filters at the zero and Nyquist frequencies. We show that the filter-bank with half-filters does not affect recognition performance on clean speech (as also shown in [Interspeech 2008]), but does affect recognition performance on noisy speech. This motivated us to revisit the linear transformation for VTLN in [Interspeech 2008] and to propose modifications that undo the effect of the half-filters during feature extraction. Recognition experiments show that the modified linear transformation performs comparably to the conventional VTLN approach, while still enabling VTLN through a linear transformation on conventional MFCC.
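
As an illustration of the filter-bank change described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes triangular filters spaced uniformly on a common mel scale (the 2595/700 formula) and assumes that the half-filters are centered exactly at 0 Hz and at the Nyquist frequency; the function and parameter names (`triangular_filterbank`, `half_filters`, and so on) are hypothetical.

```python
import numpy as np


def hz_to_mel(f):
    # One common mel-scale formula (an assumption; the paper may use another warping).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def triangular_filterbank(num_filters, num_bins, sample_rate, half_filters=False):
    """Return a (num_filters, num_bins) matrix of triangular filter weights.

    half_filters=False: conventional layout, every filter is a full triangle
        and the weights at 0 Hz and at Nyquist are zero.
    half_filters=True: the first and last filters are centered at 0 Hz and at
        Nyquist, so only one slope of each lies inside the spectrum.
    """
    nyquist = sample_rate / 2.0
    freqs = np.linspace(0.0, nyquist, num_bins)

    if half_filters:
        centers = mel_to_hz(np.linspace(0.0, hz_to_mel(nyquist), num_filters))
        centers[0], centers[-1] = 0.0, nyquist  # pin the end points exactly
        # Dummy outer edges lie outside [0, Nyquist]; they only serve to keep
        # the slope formulas below well defined for the two half-filters.
        edges = np.concatenate(([-1.0], centers, [nyquist + 1.0]))
    else:
        edges = mel_to_hz(np.linspace(0.0, hz_to_mel(nyquist), num_filters + 2))

    bank = np.zeros((num_filters, num_bins))
    for i in range(num_filters):
        lower, center, upper = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lower) / (center - lower)
        falling = (upper - freqs) / (upper - center)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


# Example: 20 filters over 257 spectral bins at a 16 kHz sampling rate.
conventional = triangular_filterbank(20, 257, 16000)
with_half = triangular_filterbank(20, 257, 16000, half_filters=True)
```

In this sketch the only difference between the two layouts is at the band edges: the conventional bank gives no weight to the bins at 0 Hz and at Nyquist, whereas the bank with half-filters covers them fully.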


Bibliographic reference. Sanand, Doddipatla Rama / Schlüter, Ralf / Ney, Hermann (2010): "Revisiting VTLN using linear transformation on conventional MFCC", in INTERSPEECH-2010, 538-541.