11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Can Tongue Be Recovered from Face? The Answer of Data-Driven Statistical Models

Atef Ben Youssef, Pierre Badin, Gérard Bailly

GIPSA, France

This study revisits the face-to-tongue articulatory inversion problem in speech. We compare the Multi Linear Regression method (MLR) with two more sophisticated methods based on Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs), using the same French corpus of articulatory data acquired by ElectroMagnetoGraphy. GMMs give better overall results than HMMs, while MLR performs poorly. GMMs and HMMs preserve the original phonetic class distribution, though with some centralisation effects; these effects are much stronger with MLR. A detailed analysis shows that, while the jaw / lips / tongue tip synergy helps recover front high vowels and coronal consonants, the velars are not recovered at all. It is therefore not possible to reliably recover the tongue from the face.
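The centralisation effect mentioned above can be illustrated with a toy example. The following is a minimal sketch on synthetic one-dimensional data, not the authors' corpus or their MLR setup: the variable names (jaw, tongue_tip, tongue_back), the coefficients and the noise levels are all hypothetical. It shows that when a least-squares linear predictor is fed an input that carries little information about the target, its predictions shrink toward the mean of the target.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y ~ a * x + b (one input, one output)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def spread_ratio(x, y):
    """Std of the regression predictions divided by std of the true values.

    A ratio near 1 means the predictions span the same range as the data;
    a ratio near 0 means they collapse toward the mean (centralisation).
    """
    a, b = fit_line(x, y)
    pred = [a * xi + b for xi in x]
    return statistics.pstdev(pred) / statistics.pstdev(y)


random.seed(0)  # fixed seed so the synthetic data are reproducible
n = 2000

# Hypothetical "face" coordinate, e.g. a normalised jaw height.
jaw = [random.gauss(0.0, 1.0) for _ in range(n)]
# Hypothetical tongue coordinate strongly coupled to the jaw.
tongue_tip = [0.9 * j + random.gauss(0.0, 0.3) for j in jaw]
# Hypothetical tongue coordinate almost independent of the jaw.
tongue_back = [0.1 * j + random.gauss(0.0, 1.0) for j in jaw]

tip_ratio = spread_ratio(jaw, tongue_tip)
back_ratio = spread_ratio(jaw, tongue_back)

print(f"tongue tip  spread ratio: {tip_ratio:.2f}")
print(f"tongue back spread ratio: {back_ratio:.2f}")
```

In this sketch the spread ratio equals the absolute correlation between input and target, so the weakly coupled coordinate is predicted as nearly constant. This is only an analogy for why articulations with little facial correlate, such as the velars in the abstract, might be poorly recovered by a linear mapping.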


Bibliographic reference.  Youssef, Atef Ben / Badin, Pierre / Bailly, Gérard (2010): "Can tongue be recovered from face? the answer of data-driven statistical models", In INTERSPEECH-2010, 2002-2005.