8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Linear Transformation Approach to VTLN Using Dynamic Frequency Warping

D. R. Sanand, D. Dinesh Kumar, S. Umesh

Indian Institute of Technology Kanpur, India

In this paper, we present a novel linear transformation approach to frequency warping for vocal tract length normalisation (VTLN), based on the idea of dynamic frequency warping (DFW). A linear transformation among the mel-frequency cepstral coefficients (MFCC) provides the computational advantage of not having to recompute features for each warp factor in VTLN. The proposed method separates the smoothing and frequency warping operations in the feature extraction stage, unlike the conventional approach, where both operations are integrated into the filter-bank operation. The advantage of the proposed DFW approach is that we can obtain a transformation matrix for any arbitrary warping, even when we do not know the functional form or mapping of the warping function. We compare the performance of the proposed method with the approaches proposed in [4] and [5] on one phone classification task and two digit recognition tasks.
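
The general idea of a warp-specific linear transformation in the cepstral domain can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an untruncated, orthonormal DCT (so the cepstrum can be inverted exactly), a log-spectrum that has already been smoothed, and a warping supplied only as an array of index correspondences, for example a path found by dynamic programming. The function names dct_matrix, warp_matrix_from_mapping and cepstral_transform are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n); its inverse is its transpose."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def warp_matrix_from_mapping(mapping):
    """Linear-interpolation matrix W with (W @ x)[i] ~ x at position mapping[i].

    Assumption: `mapping` holds (possibly fractional) source bin indices.
    Only the correspondences are needed, not a closed-form warping function.
    """
    n = len(mapping)
    pos = np.clip(np.asarray(mapping, dtype=float), 0, n - 1)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, n - 1)
    frac = pos - lo
    w = np.zeros((n, n))
    rows = np.arange(n)
    w[rows, lo] += 1.0 - frac
    w[rows, hi] += frac
    return w

def cepstral_transform(mapping):
    """T = D W D^T, so that warped cepstra are T @ c (no feature recomputation)."""
    d = dct_matrix(len(mapping))
    return d @ warp_matrix_from_mapping(mapping) @ d.T

# Usage: one matrix per candidate warping, applied to unwarped cepstra.
n_bins = 8
mapping = np.arange(n_bins) * 0.9           # illustrative warping only
T = cepstral_transform(mapping)
log_spec = np.random.default_rng(0).normal(size=n_bins)
c = dct_matrix(n_bins) @ log_spec           # unwarped cepstrum
c_warped = T @ c                            # warped cepstrum via a linear map
```

Under these assumptions, each candidate warping reduces to one matrix that can be computed once and applied to the unwarped cepstra. With truncated cepstra, as used in practice, the relation holds only approximately.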

Bibliographic reference.  Sanand, D. R. / Kumar, D. Dinesh / Umesh, S. (2007): "Linear transformation approach to VTLN using dynamic frequency warping", In INTERSPEECH-2007, 1138-1141.