This paper presents a novel speech modification method capable of controlling unobservable articulatory parameters, based on a statistical feature mapping technique with Gaussian Mixture Models (GMMs). In previous work, GMM-based statistical feature mapping was successfully applied separately to acoustic-to-articulatory inversion mapping and to articulatory-to-acoustic production mapping. In this paper, these two mapping frameworks are integrated into a unified framework to develop a novel speech modification system. The proposed system performs the inversion mapping and then the production mapping in sequence, making it possible to modify the phonemic sounds of an input speech signal by intuitively manipulating the articulatory parameters estimated from that signal. We also propose a manipulation method that automatically compensates for the unmodified articulatory movements by considering the inter-dimensional correlation of the articulatory parameters. The proposed system is implemented for a single English speaker, and its effectiveness is evaluated experimentally. The experimental results demonstrate that the proposed system is capable of modifying phonemic sounds by manipulating the estimated articulatory movements, and that higher speech quality is achieved when the inter-dimensional correlation is considered in the manipulation.
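The compensation idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single Gaussian over the articulatory parameters in place of a full GMM, and the function name `compensate` and its arguments are hypothetical. When some dimensions are manipulated, the remaining dimensions are shifted by the linear regression implied by the covariance, so that correlated articulators move together.

```python
import numpy as np

def compensate(x, cov, manipulated_idx, new_values):
    """Shift unmodified dimensions to follow the manipulated ones.

    Assumption: one Gaussian with covariance `cov` stands in for the GMM.
    x               : estimated articulatory parameter vector (1-D)
    cov             : covariance matrix of the articulatory parameters
    manipulated_idx : indices of the dimensions changed by the user
    new_values      : new values for those dimensions
    """
    x = np.asarray(x, dtype=float)
    cov = np.asarray(cov, dtype=float)
    m = np.asarray(manipulated_idx, dtype=int)
    new_values = np.asarray(new_values, dtype=float)

    # Dimensions the user did not touch.
    f = np.setdiff1d(np.arange(x.size), m)

    # Change applied to the manipulated dimensions.
    delta_m = new_values - x[m]

    # Conditional-mean shift: Sigma_fm @ inv(Sigma_mm) @ delta_m.
    shift = cov[np.ix_(f, m)] @ np.linalg.solve(cov[np.ix_(m, m)], delta_m)

    y = x.copy()
    y[m] = new_values
    y[f] = x[f] + shift
    return y

# Hypothetical 3-dimensional example: dimension 1 is correlated with
# dimension 0, dimension 2 is independent of both.
cov = np.array([[1.0, 0.5, 0.0],
                [0.5, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
modified = compensate(np.zeros(3), cov, [0], [2.0])
```

In this example, moving dimension 0 by 2.0 moves the correlated dimension 1 by 1.0 and leaves the uncorrelated dimension 2 unchanged. A GMM-based version would weight such per-component shifts by the component posteriors, which this sketch omits.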
Bibliographic reference. Tobing, Patrick Lumban / Toda, Tomoki / Neubig, Graham / Sakti, Sakriani / Nakamura, Satoshi / Purwarianti, Ayu (2014): "Articulatory controllable speech modification based on statistical feature mapping with Gaussian mixture models", In INTERSPEECH-2014, 2298-2302.