The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis
Because speakers raise the intensity of their voice when speaking in loud noise (the Lombard effect), this paper proposes a speech transformation approach that mimics the Lombard effect to improve the intelligibility of speech in noisy environments. The approach attempts to simulate the variations in duration, formant frequencies, formant bandwidths, fundamental frequency (F0), and energy in each frequency band caused by the Lombard effect. It does so using the speech manipulation system STRAIGHT and three models that control three acoustic features: the F0 contour, phoneme duration, and spectrum. Unlike other manipulation methods, this approach modifies these acoustic features simultaneously in the time-frequency representation of speech. The approach was evaluated by comparing the synthesized Lombard speech with noise-free Lombard speech in terms of similarity, naturalness, and voice quality. The experimental results show that the proposed system can convert neutral speech into Lombard speech whose quality is very close to that of natural Lombard speech.
Index Terms: speech transformation, Lombard effect, duration, fundamental frequency, spectrum.
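As a rough illustration of what jointly modifying F0, duration, and spectrum on a frame-based representation can look like, here is a minimal Python sketch. It is not the authors' method and does not use STRAIGHT: the function name `mimic_lombard`, the nearest-frame time stretch, the linear-in-dB spectral tilt, and all default values are assumptions chosen only for illustration.

```python
import numpy as np

def mimic_lombard(f0, spec, f0_scale=1.2, dur_scale=1.1, tilt_db_per_bin=0.05):
    """Toy joint modification of F0, duration, and spectrum (hypothetical helper).

    f0   : (T,) array of F0 values in Hz, with 0 marking unvoiced frames.
    spec : (T, K) array of per-frame power spectra.
    All scale factors are illustrative assumptions, not values from the paper.
    """
    f0 = np.asarray(f0, dtype=float)
    spec = np.asarray(spec, dtype=float)
    n_in = len(f0)
    n_out = int(round(n_in * dur_scale))

    # Duration: map every output frame to its nearest earlier source frame.
    src = np.minimum((np.arange(n_out) / dur_scale).astype(int), n_in - 1)

    # F0: scale the stretched contour; unvoiced frames (0 Hz) remain 0.
    f0_out = f0[src] * f0_scale

    # Spectrum: add a gain in dB that grows linearly with the frequency bin,
    # shifting energy toward higher frequencies (power domain, hence /10).
    gain_db = tilt_db_per_bin * np.arange(spec.shape[1])
    spec_out = spec[src] * 10.0 ** (gain_db / 10.0)

    return f0_out, spec_out
```

A real system would feed the modified parameters back into a vocoder to resynthesize a waveform, and would use per-phoneme rather than global duration and spectral changes.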
Bibliographic reference. Huang, Dong-Yan / Rahardja, Susanto / Ong, Ee Ping (2010): "Lombard effect mimicking", In SSW7-2010, 258-263.