Acoustic-to-Articulatory Inversion Mapping Based on Latent Trajectory Gaussian Mixture Model

Patrick Lumban Tobing, Tomoki Toda, Hirokazu Kameoka, Satoshi Nakamura


A maximum likelihood parameter trajectory estimation based on a Gaussian mixture model (GMM) has been successfully implemented for acoustic-to-articulatory inversion mapping. In the conventional method, the GMM parameters are optimized by maximizing a likelihood function for the joint static and dynamic features of acoustic-articulatory data; the articulatory parameter trajectories are then estimated for the given acoustic data by maximizing a likelihood function for only the static features, imposing a constraint between the static and dynamic features to capture the inter-frame correlation. Owing to this inconsistency between the training and mapping criteria, the trained GMM is not optimal for the mapping process. The inconsistency can be addressed within a trajectory training framework, but some parameters, e.g., covariance matrices and mixture component sequences, then become more difficult to optimize. In this paper, we propose an inversion mapping method based on a latent trajectory GMM (LT-GMM) as yet another way to overcome the inconsistency issue. The proposed method makes it possible to use a well-formulated algorithm, such as the EM algorithm, to optimize the LT-GMM parameters, which is not feasible in traditional trajectory training. Experimental results demonstrate that the proposed method yields higher inversion mapping accuracy than the conventional GMM-based method.
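The mapping step described above, maximizing the static-feature likelihood under a linear constraint tying dynamic (delta) features to the static trajectory, is the well-known ML parameter generation (MLPG) solve. As a rough illustration only (not the paper's LT-GMM method), the following toy numpy sketch solves the closed-form trajectory y = (WᵀD⁻¹W)⁻¹WᵀD⁻¹μ for a single Gaussian per frame, where W stacks static and central-difference delta windows and D is the diagonal covariance; all names here are hypothetical:

```python
import numpy as np

def delta_matrix(T):
    """Build the (2T x T) window matrix W stacking, per frame,
    a static row (y[t]) and a delta row ((y[t+1] - y[t-1]) / 2)."""
    W = np.zeros((2 * T, T))
    for t in range(T):
        W[2 * t, t] = 1.0                # static feature: y[t]
        if 0 < t < T - 1:                # delta feature: central difference
            W[2 * t + 1, t - 1] = -0.5
            W[2 * t + 1, t + 1] = 0.5
    return W

def mlpg(mean, var):
    """Solve y = (W' D^-1 W)^-1 W' D^-1 mu with D = diag(var),
    i.e. the ML static trajectory given joint static+delta statistics."""
    T = mean.shape[0] // 2
    W = delta_matrix(T)
    P = W.T / var                        # W' D^-1 (diagonal precisions)
    return np.linalg.solve(P @ W, P @ mean)

# Toy example: 5 frames, static means ramp 0..4, delta means consistent
# with that ramp (slope 1 at interior frames, 0 at the zero boundary rows).
T = 5
mean = np.zeros(2 * T)
mean[0::2] = np.arange(T, dtype=float)   # static means
mean[1::2] = 1.0                         # delta means
mean[1] = mean[-1] = 0.0                 # boundary delta rows of W are zero
var = np.ones(2 * T)
y = mlpg(mean, var)                      # recovers the ramp 0..4 exactly,
                                         # since the means are consistent
```

Because the static and delta targets here agree exactly, the solve recovers the ramp; with inconsistent targets (the usual case), the result is the variance-weighted compromise between them, which is what smooths the estimated articulatory trajectories.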


DOI: 10.21437/Interspeech.2016-1196

Cite as

Tobing, P.L., Toda, T., Kameoka, H., Nakamura, S. (2016) Acoustic-to-Articulatory Inversion Mapping Based on Latent Trajectory Gaussian Mixture Model. Proc. Interspeech 2016, 953-957.

Bibtex
@inproceedings{Tobing+2016,
  author={Patrick Lumban Tobing and Tomoki Toda and Hirokazu Kameoka and Satoshi Nakamura},
  title={Acoustic-to-Articulatory Inversion Mapping Based on Latent Trajectory Gaussian Mixture Model},
  year={2016},
  booktitle={Interspeech 2016},
  doi={10.21437/Interspeech.2016-1196},
  url={http://dx.doi.org/10.21437/Interspeech.2016-1196},
  pages={953--957}
}