EUROSPEECH 2003 - INTERSPEECH 2003
A system for visualising vocal-tract shapes during vowel articulation has been designed and developed. The system generates the vocal tract configuration using a new approach based on extracting both the area functions and the formant frequencies from the acoustic speech signal. The vocal tract area functions and the first three formants are first estimated by linear prediction analysis. The estimated area functions are then mapped to corresponding mid-sagittal distances and displayed as 2D lateral graphics of the vocal tract. The mapping is based on a simple numerical algorithm and an accurate reference grid derived from x-rays of a number of English vowels uttered by different speakers. To compensate for possible errors in the estimated area functions due to variations in vocal tract length, the first two section distances are determined from the three formants. The formants are also used to adjust the rounding of the lips and the height of the jawbone. Results show high correlation with x-ray data and with the PARAFAC analysis. The system could be useful as a visual sensory aid for speech training of the hearing-impaired.
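As an illustration of the linear-prediction step mentioned in the abstract, the following is a minimal sketch of one textbook way to go from a speech frame to a relative area function: autocorrelation, the Levinson-Durbin recursion, and the lossless-tube relation between reflection coefficients and adjacent section areas. It is not taken from the paper. The function names, the normalisation of the first section to 1.0, and the sign and direction convention of the area ratio are all assumptions made here, and conventions differ between texts.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation sums of a frame for lags 0..max_lag."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, ks): predictor polynomial a with a[0] == 1, for
    A(z) = 1 + sum_j a[j] z^-j, and the reflection coefficients ks.
    """
    a = [1.0] + [0.0] * order
    err = r[0]  # prediction error energy, updated at every order
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, ks


def relative_areas(ks, first_area=1.0):
    """Relative tube-section areas from reflection coefficients.

    Assumed convention (hypothetical, chosen for this sketch):
    area[i] = area[i-1] * (1 - k_i) / (1 + k_i), first section = first_area.
    """
    areas = [first_area]
    for k in ks:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return areas
```

In use, a windowed and pre-emphasised vowel frame would be passed to `autocorrelation`, the result to `levinson_durbin`, and the reflection coefficients to `relative_areas`. The areas are only relative, so mapping them to mid-sagittal distances, as the abstract describes, would still need a separate calibration step.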
Bibliographic reference. Mahdi, Abdulhussain E. (2003): "Visualisation of the vocal tract based on estimation of vocal area functions and formant frequencies", In EUROSPEECH-2003, 2381-2384.