11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Maximum a posteriori Voice Conversion Using Sequential Monte Carlo Methods

Elina Helander (1), Hanna Silén (1), Joaquin Míguez (2), Moncef Gabbouj (1)

(1) Tampere University of Technology, Finland
(2) Universidad Carlos III de Madrid, Spain

Many voice conversion algorithms are based on frame-wise mapping from source features into target features. This ignores the inherent temporal continuity that is present in speech and can degrade the subjective quality. In this paper, we propose to optimize the speech feature sequence after a frame-based conversion algorithm has been applied. In particular, we select the sequence of speech features through the minimization of a cost function that involves both the conversion error and the smoothness of the sequence. The estimation problem is solved using sequential Monte Carlo methods. Both subjective and objective results show the effectiveness of the method.
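The idea of trading conversion error against smoothness can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the cost function or the sampler, so the quadratic cost, the weight `lam`, the noise scale `obs_std`, and the function names below are all assumptions made for illustration. Instead of a full particle filter, the sketch draws candidate "particles" independently around each frame-wise converted frame and then runs a Viterbi search over that particle grid to pick the lowest-cost sequence.

```python
import numpy as np


def sequence_cost(seq, converted, lam):
    """Assumed cost: squared conversion error plus lam * squared frame-to-frame change."""
    seq = np.asarray(seq, dtype=float)
    converted = np.asarray(converted, dtype=float)
    error = np.sum((seq - converted) ** 2)
    roughness = np.sum((seq[1:] - seq[:-1]) ** 2)
    return error + lam * roughness


def smooth_map_sequence(converted, n_particles=200, obs_std=1.0, lam=1.0, seed=0):
    """Pick the minimum-cost sequence on a random particle grid (hypothetical helper).

    converted: (T, D) array of frame-wise converted features.
    """
    rng = np.random.default_rng(seed)
    converted = np.asarray(converted, dtype=float)
    T, D = converted.shape

    # Candidate particles: random perturbations of each converted frame.
    noise = obs_std * rng.standard_normal((T, n_particles, D))
    particles = converted[:, None, :] + noise
    # Keep the unmodified converted frame as particle 0 at every time step.
    particles[:, 0, :] = converted

    # Conversion-error term for every particle, shape (T, n_particles).
    err = np.sum((particles - converted[:, None, :]) ** 2, axis=2)

    cost = err[0].copy()
    back = np.zeros((T, n_particles), dtype=int)
    for t in range(1, T):
        # Smoothness term between every previous and current particle.
        diff = particles[t - 1][:, None, :] - particles[t][None, :, :]
        total = cost[:, None] + lam * np.sum(diff ** 2, axis=2)
        back[t] = np.argmin(total, axis=0)
        cost = total[back[t], np.arange(n_particles)] + err[t]

    # Backtrack the best path through the particle grid.
    idx = np.empty(T, dtype=int)
    idx[-1] = np.argmin(cost)
    for t in range(T - 1, 0, -1):
        idx[t - 1] = back[t, idx[t]]
    return particles[np.arange(T), idx]
```

Because the frame-wise converted sequence itself is always one path through the grid, the selected sequence never has a higher assumed cost than the unsmoothed conversion. Larger values of `lam` favor smoother trajectories at the expense of fidelity to the frame-wise output.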

Bibliographic reference. Helander, Elina / Silén, Hanna / Míguez, Joaquin / Gabbouj, Moncef (2010): "Maximum a posteriori voice conversion using sequential Monte Carlo methods", in INTERSPEECH-2010, 1716-1719.