This paper investigates several methods for performing vocal tract length normalisation (VTLN), each of which is either completely linear or piece-wise linear. Furthermore, the combination of VTLN with either standard unconstrained maximum likelihood linear regression (MLLR) or constrained MLLR is considered. Results on the Switchboard corpus show that there is little difference in performance between the different forms of VTLN and that, as previously reported, the effects of VTLN and unconstrained MLLR are largely additive. However, it was found that if multiple iterations of constrained MLLR are used, there is no additional advantage to also using VTLN.
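As an illustration of the distinction between completely linear and piece-wise linear VTLN, the following minimal Python sketch shows one common way such frequency warpings are written. It is not taken from the paper: the function names, the breakpoint `f_break`, and the choice to map the Nyquist frequency onto itself are assumptions made for this example. The 4000 Hz Nyquist value corresponds to 8 kHz telephone speech such as Switchboard.

```python
def linear_warp(f, alpha):
    """Completely linear warping: every frequency is scaled by alpha."""
    return alpha * f


def piecewise_linear_warp(f, alpha, f_nyquist=4000.0, f_break=3400.0):
    """Piece-wise linear warping (hypothetical two-segment form).

    Below f_break the frequency is scaled by alpha. Above f_break a second
    straight segment joins (f_break, alpha * f_break) to
    (f_nyquist, f_nyquist), so the warped axis still ends at the Nyquist
    frequency. Assumes alpha * f_break < f_nyquist, which keeps the
    mapping monotonic.
    """
    if f <= f_break:
        return alpha * f
    slope = (f_nyquist - alpha * f_break) / (f_nyquist - f_break)
    return alpha * f_break + slope * (f - f_break)
```

With a warp factor such as `alpha = 1.1`, the linear form maps 4000 Hz beyond the original band, whereas the piece-wise form compresses the top segment so that the band edge is preserved.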
Cite as: Uebel, L.F., Woodland, P.C. (1999) An investigation into vocal tract length normalisation. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 2527-2530, doi: 10.21437/Eurospeech.1999-553
@inproceedings{uebel99_eurospeech, author={L. F. Uebel and P. C. Woodland}, title={{An investigation into vocal tract length normalisation}}, year=1999, booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)}, pages={2527--2530}, doi={10.21437/Eurospeech.1999-553} }