ISCA Archive SSW 2010

Two vocoder techniques for neutral to emotional timbre conversion

Fabio Tesser, Enrico Zovato, Mauro Nicolao, Piero Cosi

In this paper, we describe the application of two vocoder techniques to a spectral envelope transformation experiment. We processed speech recorded in a neutral, standard reading style in order to reproduce the spectral shapes of two emotional speaking styles: happy and sad. This was achieved by means of conversion functions that operate in the frequency domain and are trained on aligned source-target pairs of spectral features. The first vocoder is based on the source-filter model of speech production and uses the Mel Log Spectrum Approximation (MLSA) filter, while the second is the phase vocoder. Objective distance measures were calculated to evaluate how effectively the conversion framework predicts the target spectral envelopes. Subjective listening tests provided further elements for the evaluation.

Index Terms: emotional speech, spectral transformation, GMM, mel-cepstral analysis, phase vocoder, MLSA filter
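
The abstract does not say which objective distance measures were used. As an illustration only, the sketch below computes mel-cepstral distortion (MCD), a measure commonly used to compare converted and target spectral envelopes. The function name, the array layout (frames by coefficients, with the energy term c0 in column 0), and the choice of MCD itself are assumptions, not details taken from the paper.

```python
import numpy as np


def mel_cepstral_distortion(target, converted):
    """Mean mel-cepstral distortion in dB between two aligned sequences.

    Both inputs are assumed to have shape (n_frames, n_coeffs) and to be
    already time-aligned frame by frame. Column 0 (the energy term c0) is
    excluded, as is common practice for this measure.
    """
    target = np.asarray(target, dtype=float)
    converted = np.asarray(converted, dtype=float)
    if target.shape != converted.shape:
        raise ValueError("sequences must be aligned and have the same shape")

    # Per-frame squared Euclidean distance over c1..cD.
    diff = target[:, 1:] - converted[:, 1:]
    squared = np.sum(diff ** 2, axis=1)

    # Standard MCD scaling: (10 / ln 10) * sqrt(2 * sum of squared differences).
    per_frame_db = (10.0 / np.log(10.0)) * np.sqrt(2.0 * squared)
    return float(np.mean(per_frame_db))
```

A lower value means the converted envelopes are closer to the target; comparing it with the distortion between the unconverted neutral source and the target gives a rough sense of how much the conversion helps.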


Cite as: Tesser, F., Zovato, E., Nicolao, M., Cosi, P. (2010) Two vocoder techniques for neutral to emotional timbre conversion. Proc. 7th ISCA Workshop on Speech Synthesis (SSW 7), 130-135

@inproceedings{tesser10_ssw,
  author={Fabio Tesser and Enrico Zovato and Mauro Nicolao and Piero Cosi},
  title={{Two vocoder techniques for neutral to emotional timbre conversion}},
  year=2010,
  booktitle={Proc. 7th ISCA Workshop on Speech Synthesis (SSW 7)},
  pages={130--135}
}