ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition (SAPA2008)

Brisbane, Australia
September 21, 2008

Data-Driven Articulatory Inversion Incorporating Articulator Priors

Adam Lammert (1), Daniel P. W. Ellis (2), Pierre Divenyi (1)

(1) EBIRE, Martinez, CA, USA; (2) Columbia University, New York, NY, USA

Recovering the motions of speech articulators from the acoustic speech signal has a long history, starting from the observation that a simple concatenated tube model is a reasonable model for the origin of formant resonances. In this work, we take a different approach, making minimal assumptions about the interdependence of acoustics and articulators by estimating the full joint distribution of the two spaces from a corpus of paired data derived from an articulatory synthesizer. This approach allows us to estimate posterior distributions of articulator state as well as to find the maximum-likelihood trajectories. We present examples comparing this approach to a related, earlier approach that did not incorporate prior distributions over articulator space, and demonstrate the advantages of learning the models from realistic utterances. We also indicate benefits available from jointly estimating particular pairs of articulators that have high mutual dependence.
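
As a rough illustration of the idea in the abstract (estimating a joint distribution from paired data and conditioning on the acoustics to get a posterior over articulator state), the following is a minimal sketch only. It assumes both spaces have already been quantized into a small number of discrete bins; the abstract does not state how the joint distribution is modeled, so the discretization, the function names `fit_joint` and `posterior`, and the toy data are all hypothetical.

```python
from collections import Counter


def fit_joint(pairs):
    """Estimate a discrete joint p(acoustic, articulator) from paired samples.

    Assumption: each sample is an (acoustic_bin, articulator_bin) tuple.
    """
    counts = Counter(pairs)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def posterior(joint, acoustic_bin):
    """Return p(articulator | acoustic) by conditioning the joint on one acoustic bin."""
    row = {art: p for (ac, art), p in joint.items() if ac == acoustic_bin}
    evidence = sum(row.values())  # marginal p(acoustic)
    if evidence == 0:
        raise ValueError("acoustic bin never observed in the paired data")
    return {art: p / evidence for art, p in row.items()}


# Hypothetical toy corpus of (acoustic bin, articulator bin) pairs.
pairs = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 2), (1, 1), (0, 0)]
joint = fit_joint(pairs)
post = posterior(joint, acoustic_bin=1)
best = max(post, key=post.get)  # most probable articulator bin given the acoustics
```

Because the joint is estimated directly from the paired corpus, it carries the distribution of articulator states in that corpus along with it, which is one simple way a prior over articulator space can enter the posterior.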

Bibliographic reference.  Lammert, Adam / Ellis, Daniel P. W. / Divenyi, Pierre (2008): "Data-driven articulatory inversion incorporating articulator priors", In SAPA-2008, 29-34.