ISCA Archive Interspeech 2013

Statistical synthesizer with embedded prosodic and spectral modifications to generate highly intelligible speech in noise

D. Erro, T. C. Zorilă, Yannis Stylianou, E. Navas, I. Hernaez

This paper describes a statistical parametric speech synthesizer that, despite having been trained on an ordinary synthesis database and without any adaptation data, is able to generate highly intelligible speech in noisy environments. By using a simple and flexible vocoder based on a harmonic model, it applies several noise-independent modifications to durations, pitch level and range, energy contour, formant sharpness, and intensity of particular spectral bands. The system has been evaluated by means of a large subjective test, the results of which show that the suggested approach clearly outperforms the reference TTS systems and even unmodified natural speech in some conditions.


doi: 10.21437/Interspeech.2013-765

Cite as: Erro, D., Zorilă, T.C., Stylianou, Y., Navas, E., Hernaez, I. (2013) Statistical synthesizer with embedded prosodic and spectral modifications to generate highly intelligible speech in noise. Proc. Interspeech 2013, 3557-3561, doi: 10.21437/Interspeech.2013-765

@inproceedings{erro13_interspeech,
  author={D. Erro and T. C. Zorilă and Yannis Stylianou and E. Navas and I. Hernaez},
  title={{Statistical synthesizer with embedded prosodic and spectral modifications to generate highly intelligible speech in noise}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={3557--3561},
  doi={10.21437/Interspeech.2013-765}
}