This paper presents a method for producing a new vowel by articulatory control in hidden Markov model (HMM) based parametric speech synthesis. A multiple regression HMM (MRHMM) is adopted to model the distribution of acoustic features, with articulatory features used as external auxiliary variables. The dependency between acoustic and articulatory features is modelled by a group of linear transforms that are either estimated context-dependently or determined by the distribution of articulatory features. Vowel identity is removed from the set of context features so that the context-dependent model parameters remain compatible with the articulatory features of a new vowel. At synthesis time, acoustic features are predicted from the input articulatory features together with the context information. Given an appropriate articulatory feature sequence, a new vowel can be generated even when it does not exist in the training set. Experimental results show that this method is effective in creating the English vowel [ʌ] by articulatory control without using any acoustic samples of this vowel.
Index Terms: Speech synthesis, articulatory features, multiple-regression hidden Markov model
Bibliographic reference. Ling, Zhen-Hua / Richmond, Korin / Yamagishi, Junichi (2012): "Vowel creation by articulatory control in HMM-based parametric speech synthesis", In INTERSPEECH-2012, 991-994.
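
The regression dependency described in the abstract can be illustrated with a minimal sketch. It assumes the common MRHMM formulation in which the acoustic mean of a state is a linear transform applied to the articulatory vector extended with a constant 1; the abstract does not give the exact form used in the paper. The function names, the per-state transform lookup, and the numeric values are hypothetical, and covariances, dynamic features, and parameter generation are omitted.

```python
from typing import Dict, List, Sequence


def predict_acoustic_mean(
    transform: Sequence[Sequence[float]],
    articulatory: Sequence[float],
) -> List[float]:
    """Return transform @ [articulatory; 1], the assumed state mean."""
    # Append a constant 1 so the last column of the transform acts as a bias.
    extended = list(articulatory) + [1.0]
    for row in transform:
        if len(row) != len(extended):
            raise ValueError("transform width must equal len(articulatory) + 1")
    return [sum(a * x for a, x in zip(row, extended)) for row in transform]


def predict_trajectory(
    state_transforms: Dict[str, Sequence[Sequence[float]]],
    state_sequence: Sequence[str],
    articulatory_sequence: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Predict one acoustic mean per frame from its state and articulatory input."""
    if len(state_sequence) != len(articulatory_sequence):
        raise ValueError("need one state label per articulatory frame")
    return [
        predict_acoustic_mean(state_transforms[state], frame)
        for state, frame in zip(state_sequence, articulatory_sequence)
    ]


# Hypothetical transforms: 2 acoustic dimensions, 2 articulatory dimensions.
# In this sketch the transforms are looked up by state, and the state labels
# carry no vowel identity; the vowel is determined by the articulatory input.
transforms = {
    "s1": [[1.0, 0.0, 0.5], [0.0, 2.0, -1.0]],
    "s2": [[0.0, 0.0, 1.0], [0.0, 0.0, 2.0]],
}
means = predict_trajectory(transforms, ["s1", "s2"], [[2.0, 3.0], [0.0, 0.0]])
```

Because the predicted mean depends on the articulatory input rather than on a vowel label, supplying an articulatory sequence for an unseen vowel changes the output without retraining, which is the property the abstract relies on.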