This paper reports preliminary results of our effort to address the acoustic-to-articulatory inversion problem. We tested an approach that simulates speech production acquisition as a distal learning task, with acoustic signals of natural utterances in the form of MFCC as input, VocalTractLab, a 3D articulatory synthesizer controlled by target approximation models, as the learner, and stochastic gradient descent as the training method. The approach was tested on a number of natural utterances, and the results were highly encouraging.
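The distal learning setup described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the synthesizer is replaced by a hypothetical scalar function (`toy_synth`), the target approximation model by a simplified first-order exponential approach (`trajectory`), the MFCC input by a single acoustic value per frame, and the gradient by a finite-difference estimate on one randomly sampled frame per step. All names and parameter values are assumptions made for illustration.

```python
import math
import random


def trajectory(target, start=0.0, rate=0.05, n_frames=50):
    """Toy target approximation (assumed first-order form):
    the articulator moves exponentially from `start` toward `target`."""
    return [target + (start - target) * math.exp(-rate * t)
            for t in range(n_frames)]


def toy_synth(position):
    """Hypothetical stand-in for the synthesizer: maps an articulator
    position to a single 'acoustic' value (not VocalTractLab, not MFCC)."""
    return math.tanh(position) + 0.1 * position


def frame_loss(target, frame, observed):
    """Squared acoustic error at one frame for a candidate target."""
    predicted = toy_synth(trajectory(target)[frame])
    return (predicted - observed[frame]) ** 2


def fit_target(observed, init=0.0, lr=0.5, steps=2000, eps=1e-4, seed=0):
    """Stochastic gradient descent on the articulatory target.
    The error is measured only in the acoustic domain (the distal signal);
    the gradient is estimated by central finite differences."""
    rng = random.Random(seed)
    target = init
    for _ in range(steps):
        frame = rng.randrange(len(observed))  # one random frame per step
        grad = (frame_loss(target + eps, frame, observed)
                - frame_loss(target - eps, frame, observed)) / (2 * eps)
        target -= lr * grad
    return target


# "Natural utterance" stand-in: acoustics generated from a hidden target.
true_target = 1.2
observed = [toy_synth(x) for x in trajectory(true_target)]
estimate = fit_target(observed)
```

In this sketch the learner never observes the hidden articulatory target; it only compares synthesized and observed acoustic values, which is the sense in which the task is distal.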
Bibliographic reference. Prom-on, Santitham / Birkholz, Peter / Xu, Yi (2013): "Training an articulatory synthesizer with continuous acoustic data", In INTERSPEECH-2013, 349-353.