5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Speech Production Of Vowel Sequences Using A Physiological Articulatory Model

Jianwu Dang, Kiyoshi Honda

ATR Human Information Processing Research Labs, Japan

This report describes the development of a physiologically based articulatory model consisting of the tongue, mandible, hyoid bone and vocal tract wall. These organs are represented as a midsagittal quasi-3D layer with a thickness of 2 cm for the tongue tissue and 3 cm for the tract wall. The geometry of these organs and muscles is extracted from volumetric MR images of a male speaker. Both the soft and rigid structures are represented by mass-points and viscoelastic springs for connective tissue, with the springs for bony organs set to an extremely large stiffness. This design makes it possible to compute soft tissue deformations and rigid organ displacements simultaneously using a single algorithm, and thus reduces the computational complexity of the simulation. A novel control method is developed to produce dynamic actions of the vocal tract, as well as to handle collisions of the tongue with the surrounding walls. Area functions are obtained for vowel sequences based on the model's vocal tract widths in the midsagittal and parasagittal planes. The proposed model demonstrated plausible dynamic behaviors for human speech articulation.
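
The idea of treating soft tissue and bone with one algorithm can be illustrated with a minimal sketch. It assumes a Kelvin-Voigt element (spring and damper in parallel) for each link and a semi-implicit Euler integrator; neither choice is stated in the abstract. All names (`spring_force`, `step`, `demo`) and all constants (masses, stiffness, damping, time step, load) are hypothetical and are not taken from the paper. A "rigid" link is simply a spring with very large stiffness, so the same update loop handles both kinds of structure.

```python
import math


def spring_force(p_i, p_j, v_i, v_j, rest_len, k, b):
    """Force on point i from a spring-damper link to point j (2D)."""
    dx, dy = p_j[0] - p_i[0], p_j[1] - p_i[1]
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length
    # Relative speed along the link axis (positive while the link lengthens).
    rel_speed = (v_j[0] - v_i[0]) * ux + (v_j[1] - v_i[1]) * uy
    magnitude = k * (length - rest_len) + b * rel_speed
    return magnitude * ux, magnitude * uy


def step(pos, vel, mass, springs, fixed, ext_force, dt):
    """Advance all mass-points by one semi-implicit Euler step."""
    forces = [list(ext_force.get(i, (0.0, 0.0))) for i in range(len(pos))]
    for i, j, rest_len, k, b in springs:
        fx, fy = spring_force(pos[i], pos[j], vel[i], vel[j], rest_len, k, b)
        forces[i][0] += fx
        forces[i][1] += fy
        forces[j][0] -= fx  # equal and opposite reaction on point j
        forces[j][1] -= fy
    new_pos, new_vel = [], []
    for i in range(len(pos)):
        if i in fixed:  # anchored point: never moves
            new_pos.append(pos[i])
            new_vel.append((0.0, 0.0))
            continue
        vx = vel[i][0] + dt * forces[i][0] / mass[i]
        vy = vel[i][1] + dt * forces[i][1] / mass[i]
        new_vel.append((vx, vy))
        new_pos.append((pos[i][0] + dt * vx, pos[i][1] + dt * vy))
    return new_pos, new_vel


def demo():
    """Three points in a row: anchor, soft link, then a near-rigid link."""
    pos = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    vel = [(0.0, 0.0)] * 3
    mass = [1.0, 1.0, 1.0]
    springs = [
        (0, 1, 1.0, 1.0e2, 20.0),   # soft link (illustrative values)
        (1, 2, 1.0, 1.0e5, 200.0),  # stiff link standing in for a rigid body
    ]
    ext_force = {2: (1.0, 0.0)}  # constant pull on the last point
    for _ in range(50000):  # 5 s of simulated time at dt = 1e-4
        pos, vel = step(pos, vel, mass, springs, {0}, ext_force, 1.0e-4)
    soft_stretch = (pos[1][0] - pos[0][0]) - 1.0
    stiff_stretch = (pos[2][0] - pos[1][0]) - 1.0
    return soft_stretch, stiff_stretch
```

In this sketch the soft link absorbs almost all of the deformation while the stiff link keeps its length nearly constant, which is the behavior the single-algorithm design relies on. The cost is that a very stiff spring forces a small time step under an explicit-style integrator.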


Bibliographic reference. Dang, Jianwu / Honda, Kiyoshi (1998): "Speech production of vowel sequences using a physiological articulatory model", in ICSLP-1998, paper 0639.