EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

        

Tracking Vocal Tract Resonances Using an Analytical Nonlinear Predictor and a Target-Guided Temporal Constraint

Li Deng, Issam Bazzi, Alex Acero

Microsoft Research, USA

This paper presents a technique for high-accuracy tracking of formants, or vocal tract resonances, based on a novel nonlinear predictor and a target-directed temporal constraint. The nonlinear predictor is constructed from a parameter-free, discrete mapping function from the formant space (frequencies and bandwidths) to the LPC-cepstral space, with trainable residuals. We examine the key role that vocal tract resonance targets play in tracking accuracy. Experimental results show that, owing to the use of targets, the formants tracked in the consonantal regions of an utterance (including closures and short pauses) exhibit the same dynamic properties as those in the vocalic regions and reflect the underlying vocal tract resonances. The results also demonstrate that training the prediction-residual parameters and incorporating the target-based constraint are both effective in obtaining high-accuracy formant estimates, especially for non-sonorant portions of speech.
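As an illustration of the kind of mapping the abstract describes, the sketch below computes LPC cepstral coefficients from resonance frequencies and bandwidths using the standard closed-form relation for an all-pole model, c_n = (2/n) * sum_k exp(-pi*n*b_k/f_s) * cos(2*pi*n*f_k/f_s). That this is the exact function used in the paper is an assumption; the abstract does not state the formula. The function name, argument names, and sampling rate are hypothetical, and the trainable residual term mentioned in the abstract is omitted.

```python
import math


def formants_to_lpc_cepstrum(freqs_hz, bandwidths_hz, n_cep, fs_hz):
    """Map resonance frequencies/bandwidths to LPC cepstra (illustrative sketch).

    Uses the standard all-pole relation
        c_n = (2/n) * sum_k exp(-pi*n*b_k/fs) * cos(2*pi*n*f_k/fs)
    for n = 1..n_cep. Names and signature are hypothetical, and no
    prediction residual is added.
    """
    if len(freqs_hz) != len(bandwidths_hz):
        raise ValueError("need one bandwidth per resonance frequency")
    cepstrum = []
    for n in range(1, n_cep + 1):
        total = 0.0
        for f, b in zip(freqs_hz, bandwidths_hz):
            decay = math.exp(-math.pi * n * b / fs_hz)       # bandwidth term
            oscillation = math.cos(2.0 * math.pi * n * f / fs_hz)  # frequency term
            total += decay * oscillation
        cepstrum.append(2.0 * total / n)
    return cepstrum


if __name__ == "__main__":
    # Example values are arbitrary, chosen only to exercise the function.
    c = formants_to_lpc_cepstrum([500.0, 1500.0, 2500.0],
                                 [60.0, 90.0, 120.0],
                                 n_cep=12, fs_hz=8000.0)
    print(c)
```

The mapping has no free parameters of its own, which matches the "parameter-free" description; in a tracker, a function like this would play the role of the predictor from the hidden resonance state to the observed cepstra.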


Bibliographic reference. Deng, Li / Bazzi, Issam / Acero, Alex (2003): "Tracking vocal tract resonances using an analytical nonlinear predictor and a target-guided temporal constraint", in EUROSPEECH-2003, 73-76.