In speech recognition studies, vocal tract length normalization (VTLN) techniques are widely used to cancel age and gender differences. In VTLN, the distortion is often modeled as a linear transform in cepstrum space: c' = Ac. In our previous study, the geometrical properties of A were discussed, and it was shown that the matrix can be approximated as a rotation matrix. In this study, a new method that approximates A more accurately is proposed. Using the eigenvalues of A, its quasi-rotational distortion is factorized into multiple rotation operations and multiple magnification operations. This resolves the intrinsic ambiguity of the single rotation angle used in our previous study. Instead, multiple rotation angles are introduced to better understand what kind of geometrical distortion A induces on cepstrum vectors. Experiments show the validity of the new method, and a new speech feature is also derived from it.
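The eigenvalue-based factorization described above can be illustrated with a minimal sketch. It is not the authors' procedure, and it does not build an actual VTLN matrix. It only assumes the standard fact that a real matrix has complex eigenvalues in conjugate pairs r·exp(±iθ), each of which can be read as a rotation by θ combined with a magnification by r in one two-dimensional invariant plane. The function names and the 4×4 toy matrix are hypothetical.

```python
import numpy as np


def rotation_angles_and_gains(A, tol=1e-9):
    """Read rotation angles and magnifications off the eigenvalues of a real matrix.

    Each complex-conjugate pair r*exp(+-i*theta) is reported once, as
    (theta, r). Real eigenvalues are returned separately, since they
    describe pure scaling along a single direction.
    """
    eigvals = np.linalg.eigvals(A)
    # Keep one member of every conjugate pair: the one with positive imaginary part.
    upper = eigvals[eigvals.imag > tol]
    upper = upper[np.argsort(np.angle(upper))]
    angles = np.angle(upper)   # rotation angle of each invariant plane
    gains = np.abs(upper)      # magnification in each invariant plane
    real = np.sort(eigvals[np.abs(eigvals.imag) <= tol].real)
    return angles, gains, real


def scaled_rotation(theta, r):
    """2x2 rotation by theta combined with a uniform magnification r."""
    c, s = np.cos(theta), np.sin(theta)
    return r * np.array([[c, -s], [s, c]])


def toy_matrix():
    """Hypothetical stand-in for A: two scaled rotations in a rotated basis."""
    B = np.zeros((4, 4))
    B[:2, :2] = scaled_rotation(0.3, 1.1)
    B[2:, 2:] = scaled_rotation(0.8, 0.9)
    # Change to a random orthonormal basis so the planes are not axis-aligned.
    rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
    return Q @ B @ Q.T


if __name__ == "__main__":
    angles, gains, real = rotation_angles_and_gains(toy_matrix())
    print("rotation angles (rad):", angles)
    print("magnifications:", gains)
    print("real eigenvalues:", real)
```

In this toy case the invariant planes are orthogonal by construction. For a general A they need not be, so the angles and magnifications recovered this way describe A only up to a change of basis.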
Bibliographic reference. Saito, Daisuke / Minematsu, Nobuaki / Hirose, Keikichi (2008): "Decomposition of rotational distortion caused by VTL difference using eigenvalues of its transformation matrix", In INTERSPEECH-2008, 1361-1364.