INTERSPEECH 2004 - ICSLP
Voice conversion (VC) is a powerful technology for customizing Text-to-Speech (TTS) systems. This paper addresses the integration of a VC method based on a Gaussian Mixture Model (GMM) into a TTS system. Within this framework, an algorithm that reduces the complexity of the VC processing is proposed. The main idea is to restrict the conversion function, for each frame, to the most representative components of the GMM and, if necessary, to store the component indices and their associated weights in the acoustic dictionary. The method is evaluated against a classical GMM-based transformation function. Tests show that both methods yield comparable results. Furthermore, additional experiments indicate that the new technique significantly reduces the computational load of the conversion process.
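The restriction to the most representative components can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual joint-density form of GMM-based conversion, in which each component contributes a linear regression weighted by its posterior probability, and it assumes that "most representative" means the components with the highest posteriors, with the kept weights renormalized. The names `posteriors`, `convert_frame`, and `top_k` are hypothetical and chosen only for illustration.

```python
import numpy as np

def posteriors(x, weights, means_x, covs_xx):
    """Posterior probability of each GMM component given source frame x."""
    d = x.shape[0]
    log_p = np.empty(len(weights))
    for i, (w, mu, cov) in enumerate(zip(weights, means_x, covs_xx)):
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        maha = diff @ np.linalg.solve(cov, diff)
        log_p[i] = np.log(w) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha)
    log_p -= log_p.max()          # stabilize before exponentiating
    p = np.exp(log_p)
    return p / p.sum()

def convert_frame(x, post, means_x, means_y, covs_xx, covs_yx, top_k=None):
    """Convert one frame; if top_k is set, use only the top_k components.

    Assumption: components are ranked by posterior and the kept
    weights are renormalized to sum to one.
    """
    idx = np.arange(len(post))
    if top_k is not None:
        idx = np.argsort(post)[::-1][:top_k]   # highest posteriors first
    w = post[idx] / post[idx].sum()
    y = np.zeros(means_y.shape[1])
    for wi, i in zip(w, idx):
        # Per-component linear regression from source to target space
        reg = covs_yx[i] @ np.linalg.solve(covs_xx[i], x - means_x[i])
        y += wi * (means_y[i] + reg)
    return y
```

With `top_k=None` the function evaluates every component, as a classical GMM-based conversion would under the same assumptions. With a small `top_k`, the per-frame cost scales with the number of kept components rather than the full mixture size. Because `idx` and `w` depend only on the source frame, they could in principle be computed offline and stored alongside each frame of the acoustic dictionary, which is one reading of the storage step mentioned in the abstract.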
Bibliographic reference. En-Najjary, Taoufik / Rosec, Olivier / Chonavel, Thierry (2004): "Fast GMM-based voice conversion for text-to-speech synthesis systems", In INTERSPEECH-2004, 1229-1232.