ISCA Archive Interspeech 2005

PROSPECT features and their application to missing data techniques for vocal tract length normalization

Wim Jansen, Hugo Van Hamme

Speaker normalization by (piecewise) linear warping of the frequency axis is a popular method because of its simplicity and effectiveness. However, when this so-called vocal tract length normalization is applied to map test speakers with a shorter vocal tract onto acoustic models trained on speakers with a longer vocal tract, important information is missing in the frequency bins at the high end of the spectrum. Usually, this missing information is reconstructed by ad hoc rules or by extrapolating the spectrum. In this paper, we present a new method to estimate the content of those bins. The proposed solution is derived from Missing Data Techniques, which are used in noise-robust speech recognizers. To alleviate the accuracy loss associated with Missing Data Techniques, which are usually expressed in the spectral domain, we apply the PROSPECT feature representation introduced about a year ago. We demonstrate the superiority of our approach on the TIDigits database.
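The following minimal sketch illustrates why high-frequency bins end up missing when a spectrum is warped linearly. It is not the authors' method: the function name `warp_spectrum`, the warping factor `beta`, and the 4 kHz Nyquist frequency are assumptions chosen for illustration, and the sketch only marks the unreliable bins rather than reconstructing them.

```python
import numpy as np

def warp_spectrum(spectrum, beta, nyquist_hz=4000.0):
    """Linearly warp a magnitude spectrum and flag bins with no source data.

    Hypothetical helper: the normalized bin at frequency f is read from the
    original spectrum at beta * f. With beta > 1 (speaker with a shorter
    vocal tract), the top bins would need data above the Nyquist frequency.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    n_bins = spectrum.shape[0]
    freqs = np.linspace(0.0, nyquist_hz, n_bins)   # bin centre frequencies
    source_freqs = beta * freqs                    # where each bin reads from
    missing = source_freqs > nyquist_hz            # no observation available
    warped = np.full(n_bins, np.nan)               # NaN marks missing bins
    warped[~missing] = np.interp(source_freqs[~missing], freqs, spectrum)
    return warped, missing

# Toy spectrum that falls linearly from 1 to 0 over 9 bins.
spectrum = np.linspace(1.0, 0.0, 9)
warped, missing = warp_spectrum(spectrum, beta=1.25)
print(missing)
```

The boolean mask plays the role of a reliability mask in a missing data framework: bins flagged `True` carry no observation and have to be estimated by some other means, such as the extrapolation rules or the missing data approach mentioned in the abstract.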


doi: 10.21437/Interspeech.2005-703

Cite as: Jansen, W., Hamme, H.V. (2005) PROSPECT features and their application to missing data techniques for vocal tract length normalization. Proc. Interspeech 2005, 2753-2756, doi: 10.21437/Interspeech.2005-703

@inproceedings{jansen05_interspeech,
  author={Wim Jansen and Hugo Van Hamme},
  title={{PROSPECT features and their application to missing data techniques for vocal tract length normalization}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={2753--2756},
  doi={10.21437/Interspeech.2005-703}
}