16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

On Glottal Source Shape Parameter Transformation Using a Novel Deterministic and Stochastic Speech Analysis and Synthesis System

Stefan Huber, Axel Roebel

IRCAM, France

In this paper we present a flexible deterministic plus stochastic model (DSM) approach for high-quality parametric speech analysis and synthesis. The novelty of the proposed speech processing system lies in its extended means to estimate the unvoiced stochastic component and to robustly handle the transformation of the glottal excitation source. It is therefore well suited as a speech system for Voice Transformation and Voice Conversion. The system is evaluated in the context of a voice quality transformation on natural human speech. The voice quality of a speech phrase is altered by re-synthesizing the deterministic component with different pulse shapes of the glottal excitation source. A subjective listening test suggests that the speech processing system is able to synthesize speech that evokes in the listener the perceptual sensation of different voice quality characteristics. Additionally, improvements in speech synthesis quality compared to a baseline method are demonstrated.
