A Flexible Spectral Modification Method Based on Temporal Decomposition and Gaussian Mixture Model

Binh Phu Nguyen, Masato Akagi

JAIST, Japan

This paper presents a new spectral modification method that addresses two drawbacks of conventional spectral modification methods: insufficient smoothness of the modified spectra between frames, and ineffective spectral modification. To overcome the insufficient smoothness, a speech analysis technique called temporal decomposition (TD) is used to model the spectral evolution. Instead of modifying the speech spectra frame by frame, we only need to modify the event targets and event functions, and the smoothness of the modified speech is ensured by the shape of the event functions. To overcome the ineffective spectral modification, we explore the use of Gaussian mixture model (GMM) parameters as the input to TD to model the spectral envelope, and develop a new method of modifying the GMM parameters in accordance with formant scaling factors. Experimental results verify the effectiveness of the proposed method in terms of both the smoothness of the modified speech and the effectiveness of the spectral modification.
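
The idea of modifying event targets rather than individual frames can be illustrated with a minimal sketch. It assumes the usual TD reconstruction form, in which each frame is a sum of event targets weighted by overlapping event functions. Everything concrete below is a hypothetical stand-in and not the authors' implementation: the piecewise-linear event functions, the helper names `event_functions` and `reconstruct`, the two-dimensional targets (loosely, formant-like frequencies in Hz), and the plain per-dimension scaling used in place of the paper's GMM parameter modification.

```python
import numpy as np

def event_functions(centers, n_frames):
    """Piecewise-linear event functions (an assumption for illustration).

    Row k peaks at centers[k]; adjacent functions overlap and all rows
    sum to 1 at every frame.
    """
    frames = np.arange(n_frames)
    phi = np.empty((len(centers), n_frames))
    for k in range(len(centers)):
        unit = np.zeros(len(centers))
        unit[k] = 1.0
        phi[k] = np.interp(frames, centers, unit)
    return phi

def reconstruct(targets, phi):
    """y(n) = sum_k targets[k] * phi[k, n]; returns shape (n_frames, dim)."""
    return phi.T @ targets

# Hypothetical event locations (frame indices) and event targets.
centers = [0, 10, 20]
n_frames = 21
targets = np.array([[500.0, 1500.0],
                    [700.0, 1200.0],
                    [300.0, 2300.0]])

phi = event_functions(centers, n_frames)
original = reconstruct(targets, phi)

# Stand-in for formant scaling: scale each target dimension once,
# instead of editing all 21 frames individually.
scale = np.array([1.1, 0.9])
modified_targets = targets * scale
modified = reconstruct(modified_targets, phi)
```

Only three targets are edited here, yet every frame of `modified` changes, and the frame-to-frame transitions still follow the unchanged event functions. This is the sense in which the shape of the event functions governs the smoothness of the result.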

Bibliographic reference.  Nguyen, Binh Phu / Akagi, Masato (2007): "A flexible spectral modification method based on temporal decomposition and Gaussian mixture model", In INTERSPEECH-2007, 538-541.