Locally Linear Embedding for Exemplar-Based Spectral Conversion

Yi-Chiao Wu, Hsin-Te Hwang, Chin-Cheng Hsu, Yu Tsao, Hsin-Min Wang

This paper describes a novel exemplar-based spectral conversion (SC) system developed by the AST (Academia Sinica, Taipei) team for the 2016 voice conversion challenge (vcc2016). The key feature of our system is that it integrates the locally linear embedding (LLE) algorithm, a manifold learning algorithm that has been successfully applied to the super-resolution task in image processing, with the conventional exemplar-based SC method. To further improve the quality of the converted speech, our system also incorporates (1) the maximum likelihood parameter generation (MLPG) algorithm, (2) the postfiltering-based global variance (GV) compensation method, and (3) a high-resolution feature extraction process. The results of the subjective evaluation conducted by the vcc2016 organizer show that our LLE-based exemplar SC system notably outperforms the baseline GMM-based system (implemented by the vcc2016 organizer). Moreover, our own internal evaluation results confirm the effectiveness of the core LLE-based exemplar SC method and of the three additional techniques in improving speech quality.
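The abstract describes the central idea only at a high level: each source spectral frame is locally reconstructed from its nearest source exemplars, and the resulting LLE weights are reused on the paired target exemplars, as in LLE-based image super-resolution. The following is a minimal sketch of that weight-sharing scheme, not the authors' implementation; the function name, the neighborhood size `k`, and the Gram-matrix regularizer `reg` are illustrative assumptions.

```python
import numpy as np

def lle_convert(src_frames, src_exemplars, tgt_exemplars, k=5, reg=1e-3):
    """Sketch of LLE-style exemplar mapping (illustrative, not the paper's code).

    src_frames: (T, D) source spectral features to convert.
    src_exemplars, tgt_exemplars: paired (N, D) source/target dictionaries.
    """
    converted = np.empty_like(src_frames, dtype=float)
    for t, x in enumerate(src_frames):
        # 1) Find the k nearest source exemplars (Euclidean distance).
        dists = np.linalg.norm(src_exemplars - x, axis=1)
        idx = np.argsort(dists)[:k]
        A = src_exemplars[idx]                    # (k, D) local neighborhood

        # 2) Solve for reconstruction weights w minimizing ||x - w @ A||^2
        #    subject to sum(w) = 1, via the local Gram matrix (standard LLE).
        G = (A - x) @ (A - x).T                   # (k, k)
        G = G + reg * np.eye(k) * max(np.trace(G), 1.0)  # regularize for stability
        w = np.linalg.solve(G, np.ones(k))
        w = w / w.sum()

        # 3) Reuse the same weights on the paired target exemplars.
        converted[t] = w @ tgt_exemplars[idx]
    return converted
```

Because the weights are shared between paired dictionaries, any mapping that is locally (affinely) consistent between source and target exemplars is preserved by the conversion.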

DOI: 10.21437/Interspeech.2016-567

Cite as

Wu, Y., Hwang, H., Hsu, C., Tsao, Y., Wang, H. (2016) Locally Linear Embedding for Exemplar-Based Spectral Conversion. Proc. Interspeech 2016, 1652-1656.

@inproceedings{wu2016_interspeech,
  author={Yi-Chiao Wu and Hsin-Te Hwang and Chin-Cheng Hsu and Yu Tsao and Hsin-Min Wang},
  title={Locally Linear Embedding for Exemplar-Based Spectral Conversion},
  booktitle={Interspeech 2016},
  year={2016},
  pages={1652--1656},
  doi={10.21437/Interspeech.2016-567}
}