10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Acoustic-to-Articulatory Inversion Using Speech Recognition and Trajectory Formation Based on Phoneme Hidden Markov Models

Atef Ben Youssef, Pierre Badin, Gérard Bailly, Panikos Heracleous

GIPSA, France

To recover the movements of articulators that are usually hidden, such as the tongue or velum, we have developed a data-based speech inversion method. HMMs are trained, in a multistream framework, from two synchronous streams: articulatory movements measured by EMA, and MFCC + energy from the speech signal. A speech recognition procedure based on the acoustic part of the HMMs delivers the chain of phonemes together with their durations; this information is subsequently used by a trajectory formation procedure based on the articulatory part of the HMMs to synthesise the articulatory movements. The RMS reconstruction error ranged between 1.1 and 2. mm.
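
The second half of this pipeline (phonemes and durations in, articulatory trajectory out, scored by RMS error) can be illustrated with a minimal sketch. It is not the method of the paper: it assumes a single mean articulatory target per phoneme and replaces HMM-based trajectory formation with a plain moving average. All names (`piecewise_targets`, `smooth`, `rms_error`) and all numeric values are hypothetical.

```python
from math import sqrt

def piecewise_targets(segments, means):
    """Expand (phoneme, n_frames) pairs into one target vector per frame."""
    frames = []
    for phoneme, n_frames in segments:
        frames.extend([list(means[phoneme])] * n_frames)
    return frames

def smooth(frames, half_width=2):
    """Moving average over time: a crude stand-in for HMM trajectory formation."""
    out = []
    for t in range(len(frames)):
        lo = max(0, t - half_width)
        hi = min(len(frames), t + half_width + 1)
        window = frames[lo:hi]
        # Average each coordinate over the frames in the window.
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out

def rms_error(predicted, measured):
    """Root-mean-square error pooled over all frames and coordinates."""
    squared = [
        (p - m) ** 2
        for p_frame, m_frame in zip(predicted, measured)
        for p, m in zip(p_frame, m_frame)
    ]
    return sqrt(sum(squared) / len(squared))

# Hypothetical recogniser output: phoneme labels with durations in frames.
segments = [("a", 3), ("t", 3)]
# Hypothetical mean (x, y) position of one EMA coil per phoneme, in mm.
means = {"a": (0.0, 0.0), "t": (10.0, 4.0)}

trajectory = smooth(piecewise_targets(segments, means), half_width=1)
```

Here `rms_error(trajectory, measured)` would compare the synthesised frames with measured EMA frames of the same length. Pooling the error over all coordinates is an assumption made for brevity; a per-coil or per-coordinate error would work the same way.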

Bibliographic reference. Youssef, Atef Ben / Badin, Pierre / Bailly, Gérard / Heracleous, Panikos (2009): "Acoustic-to-articulatory inversion using speech recognition and trajectory formation based on phoneme hidden Markov models", in INTERSPEECH-2009, 2255-2258.