14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Lombard Modified Text-to-Speech Synthesis for Improved Intelligibility: Submission for the Hurricane Challenge 2013

Antti Suni (1), Reima Karhila (2), Tuomo Raitio (2), Mikko Kurimo (2), Martti Vainio (1), Paavo Alku (2)

(1) University of Helsinki, Finland
(2) Aalto University, Finland

This paper describes the modification of a TTS system to improve the intelligibility of speech in various noise conditions. First, the GlottHMM vocoder is used to train a voice on modal speech data. The vocoder and voice parameters are then modified to mimic the properties of the Lombard effect, based on a small amount of Lombard speech from the same speaker. More specifically, the durations are increased, the fundamental frequency is raised, the spectral tilt is decreased, the harmonic-to-noise ratio is increased, and pressed glottal flow pulses are used in creating the excitation. The formants of the speech are also enhanced, and finally the speech is compressed to increase the noise robustness of the voice. The evaluation results of the Hurricane Challenge 2013 indicate that the modified voice is mostly less intelligible than the unmodified natural speech, as expected, but more intelligible than the reference TTS voice, especially in the low-SNR conditions.
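
To make three of the listed modifications concrete (longer durations, raised fundamental frequency, reduced spectral tilt followed by compression), the following is a minimal illustrative sketch in Python. It is not the GlottHMM implementation or the authors' method: the function names, the scaling factors, the first-order pre-emphasis filter used as a stand-in for tilt reduction, and the power-law compressor are all assumptions chosen for illustration, and the code operates on plain lists rather than on vocoder parameters.

```python
import math


def lombard_style_params(durations_ms, f0_hz, duration_scale=1.2,
                         f0_shift_semitones=2.0):
    """Lengthen durations and raise voiced F0 values.

    Hypothetical helper: the default factors are illustrative assumptions,
    not values from the paper. Frames with F0 == 0 are treated as unvoiced
    and left unchanged.
    """
    ratio = 2.0 ** (f0_shift_semitones / 12.0)
    new_durations = [d * duration_scale for d in durations_ms]
    new_f0 = [f * ratio if f > 0 else 0.0 for f in f0_hz]
    return new_durations, new_f0


def flatten_tilt(samples, coeff=0.9):
    """First-order pre-emphasis, y[n] = x[n] - coeff * x[n-1].

    Boosts high frequencies relative to low ones, which reduces spectral
    tilt. Assumed stand-in for the vocoder-level tilt modification.
    """
    out = []
    prev = 0.0
    for x in samples:
        out.append(x - coeff * prev)
        prev = x
    return out


def compress(samples, exponent=0.5):
    """Static power-law compression relative to the peak amplitude.

    Assumed example of dynamic range compression; quiet samples are raised
    towards the peak level while the peak itself is preserved.
    """
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [math.copysign((abs(x) / peak) ** exponent, x) * peak
            for x in samples]
```

In this sketch the duration and F0 changes would be applied to the synthesis parameters before waveform generation, while tilt flattening and compression act on the generated samples. The system described in the abstract applies its changes at the vocoder and voice-parameter level, so the ordering and mechanisms here are only an approximation of the idea.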


Bibliographic reference. Suni, Antti / Karhila, Reima / Raitio, Tuomo / Kurimo, Mikko / Vainio, Martti / Alku, Paavo (2013): "Lombard modified text-to-speech synthesis for improved intelligibility: submission for the Hurricane Challenge 2013", in INTERSPEECH-2013, 3562-3566.