5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Unified Physiological Model of Audible-Visible Speech Production

Eric Vatikiotis-Bateson, Hani Yehia

ATR Human Information Research Laboratories, Kyoto, Japan

In this paper, vocal tract and orofacial motions are measured during speech production in order to demonstrate that vocal tract motion can be used to estimate its orofacial counterpart. The inverse estimation, i.e. of vocal tract behavior from orofacial motion, is also possible, but to a lesser extent. Numerically, vocal tract motion accounted for 96% of the total variance observed in the joint system, whereas orofacial motion accounted for 77%. This analysis is part of a wider study in which a dynamical model is being developed to express vocal tract and orofacial motions as a function of muscle activity. This model, currently implemented through multilinear second-order autoregressive techniques, is described briefly. Finally, the strong direct influence that vocal tract and facial motions have on the energy of the speech acoustics is exemplified.
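
As a rough illustration of what estimating one set of motions from the other can involve, the sketch below fits a linear least-squares map between two synthetic marker sets and reports the fraction of variance recovered in each direction. The data, the dimensions, the shared latent drive, and the function names `fit_linear_map` and `variance_accounted_for` are all assumptions made for illustration; they are not the measurements or the procedure used in this study.

```python
import numpy as np


def fit_linear_map(X, Y):
    """Least-squares W such that (X - mean(X)) @ W approximates (Y - mean(Y))."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    W, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    return W


def variance_accounted_for(X, Y):
    """Fraction of the total variance of Y recovered by a linear map from X."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    residual = Yc - Xc @ fit_linear_map(X, Y)
    return 1.0 - (residual ** 2).sum() / (Yc ** 2).sum()


# Synthetic stand-in data (assumption): both marker sets are driven by a
# shared latent signal, but the hypothetical orofacial set sees only three
# of the four latent components.
rng = np.random.default_rng(0)
n_frames = 2000
latent = rng.standard_normal((n_frames, 4))
vocal_tract = (latent @ rng.standard_normal((4, 6))
               + 0.1 * rng.standard_normal((n_frames, 6)))
orofacial = (latent[:, :3] @ rng.standard_normal((3, 9))
             + 0.1 * rng.standard_normal((n_frames, 9)))

vaf_forward = variance_accounted_for(vocal_tract, orofacial)  # tract -> face
vaf_inverse = variance_accounted_for(orofacial, vocal_tract)  # face -> tract
print(f"vocal tract -> orofacial: {vaf_forward:.2f}")
print(f"orofacial -> vocal tract: {vaf_inverse:.2f}")
```

In this toy setup the asymmetry between the two directions is built in by construction, since one latent component never reaches the synthetic orofacial markers. It only mirrors the direction of the asymmetry reported above and says nothing about its size in real data.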

