ML Parameter Generation with a Reformulated MGE Training Criterion — Participation in the Voice Conversion Challenge 2016

D. Erro, A. Alonso, L. Serrano, D. Tavarez, I. Odriozola, Xabier Sarasola, Eder del Blanco, J. Sanchez, I. Saratxaga, Eva Navas, Inma Hernaez


This paper describes our entry to the Voice Conversion Challenge 2016. Based on the maximum likelihood parameter generation algorithm, the method is a reformulation of the minimum generation error training criterion. It uses a GMM for soft classification, a Mel-cepstral vocoder for acoustic analysis and an improved dynamic time warping procedure for source-target alignment. To compensate for the oversmoothing effect, the generated parameters are filtered through a speaker-independent postfilter implemented as a linear transform in the cepstral domain. The process is completed with mean and variance adaptation of the log-fundamental frequency and duration modification by a constant factor. The results of the evaluation show that the proposed system achieves a high conversion accuracy in comparison with other systems, while its naturalness scores are intermediate.
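
The mean and variance adaptation of the log-fundamental frequency mentioned in the abstract is commonly written as a linear map in the log domain: log f0' = mu_tgt + (sigma_tgt / sigma_src) * (log f0 - mu_src). The sketch below illustrates that standard formula only; it is not taken from the paper, and the function names, the convention that unvoiced frames are marked with 0, and the use of population standard deviation are all assumptions made for illustration.

```python
import math
import statistics


def log_f0_stats(f0_values):
    """Return (mean, std) of log-F0 over voiced frames.

    Assumption: unvoiced frames are encoded as 0 (or any non-positive value).
    """
    log_f0 = [math.log(f0) for f0 in f0_values if f0 > 0.0]
    return statistics.fmean(log_f0), statistics.pstdev(log_f0)


def adapt_log_f0(f0_source, src_mean, src_std, tgt_mean, tgt_std):
    """Shift and scale source log-F0 so it matches the target statistics.

    Hypothetical helper: unvoiced frames are passed through unchanged.
    """
    converted = []
    for f0 in f0_source:
        if f0 <= 0.0:
            converted.append(0.0)  # keep unvoiced frames unvoiced
            continue
        new_log_f0 = tgt_mean + (tgt_std / src_std) * (math.log(f0) - src_mean)
        converted.append(math.exp(new_log_f0))
    return converted
```

In practice the source and target statistics would each be estimated once from the corresponding speaker's training utterances and then applied frame by frame at conversion time.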


DOI: 10.21437/Interspeech.2016-219

Cite as

Erro, D., Alonso, A., Serrano, L., Tavarez, D., Odriozola, I., Sarasola, X., Blanco, E.d., Sanchez, J., Saratxaga, I., Navas, E., Hernaez, I. (2016) ML Parameter Generation with a Reformulated MGE Training Criterion — Participation in the Voice Conversion Challenge 2016. Proc. Interspeech 2016, 1662-1666.

Bibtex
@inproceedings{Erro+2016,
author={D. Erro and A. Alonso and L. Serrano and D. Tavarez and I. Odriozola and Xabier Sarasola and Eder del Blanco and J. Sanchez and I. Saratxaga and Eva Navas and Inma Hernaez},
title={ML Parameter Generation with a Reformulated MGE Training Criterion — Participation in the Voice Conversion Challenge 2016},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-219},
url={http://dx.doi.org/10.21437/Interspeech.2016-219},
pages={1662--1666}
}