Eighth ISCA Workshop on Speech Synthesis

Barcelona, Catalonia, Spain
August 31 – September 2, 2013

Mage - Reactive Articulatory Feature Control of HMM-based Parametric Speech Synthesis

Maria Astrinaki (1), Alexis Moinet (1), Junichi Yamagishi (2,3), Korin Richmond (3), Zhen-Hua Ling (4), Simon King (3), Thierry Dutoit (1)

(1) TCTS Lab., Numediart Institute, University of Mons, Belgium
(2) National Institute of Informatics, Tokyo, Japan
(3) University of Edinburgh, UK
(4) University of Science and Technology of China (USTC), China

In this paper, we present the integration of articulatory control into MAGE, a framework for real-time and interactive (reactive) parametric speech synthesis using hidden Markov models (HMMs). MAGE is based on the speech synthesis engine from HTS and uses acoustic features (spectrum and f0) to model and synthesize speech. In this work, we replace the standard acoustic models with models that combine acoustic and articulatory features, such as tongue, lip and jaw positions. We then use feature-space-switched articulatory-to-acoustic regression matrices, which enable us to control the spectral acoustic features by manipulating the articulatory features. Combining this synthesis model with MAGE allows us to interactively and intuitively modify phones synthesized in real time, for example transforming one phone into another, by controlling the configuration of the articulators in a visual display.
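The feature-space-switched regression described above can be illustrated with a minimal sketch: the articulatory feature space is partitioned into regions, and each region carries its own linear regression from articulatory to spectral features, so moving the articulators selects a matrix and drives the spectrum. All names, dimensionalities and the nearest-centroid switching scheme below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Illustrative sketch only: dimensions and the nearest-centroid switching
# scheme are assumptions, not the authors' actual model.
rng = np.random.default_rng(0)

N_CLUSTERS = 4   # number of articulatory feature-space regions (assumed)
ART_DIM = 6      # e.g. tongue, lip and jaw positions (assumed dimensionality)
SPEC_DIM = 25    # spectral feature dimension (assumed)

# One cluster centroid, regression matrix and bias per feature-space region.
centroids = rng.normal(size=(N_CLUSTERS, ART_DIM))
A = rng.normal(scale=0.1, size=(N_CLUSTERS, SPEC_DIM, ART_DIM))
b = rng.normal(scale=0.1, size=(N_CLUSTERS, SPEC_DIM))

def articulatory_to_spectral(x):
    """Map an articulatory feature vector to spectral features using the
    regression matrix of the nearest articulatory feature-space cluster."""
    k = int(np.argmin(np.linalg.norm(centroids - x, axis=1)))
    return A[k] @ x + b[k]

# Reactive control: each new articulator configuration (e.g. from a visual
# display) is mapped to spectral features for the synthesizer.
x = rng.normal(size=ART_DIM)
y = articulatory_to_spectral(x)
```

In this toy setup, smoothly dragging an articulator position in a GUI would smoothly change `y` within a region, with the regression switching as the trajectory crosses into another feature-space cluster.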


Bibliographic reference.  Astrinaki, Maria / Moinet, Alexis / Yamagishi, Junichi / Richmond, Korin / Ling, Zhen-Hua / King, Simon / Dutoit, Thierry (2013): "Mage - reactive articulatory feature control of HMM-based parametric speech synthesis", In SSW8, 207-211.