INTERSPEECH 2015
16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Phase Perception of the Glottal Excitation of Vocoded Speech

Tuomo Raitio (1), Lauri Juvela (1), Antti Suni (2), Martti Vainio (2), Paavo Alku (1)

(1) Aalto University, Finland
(2) University of Helsinki, Finland

While the amplitude spectrum of the voiced excitation has been studied widely in both natural and synthetic speech, the role of the excitation phase remains less explored. In speech synthesis especially, phase information is often omitted for simplicity. This study investigates the impact of the phase information of the excitation signal of voiced speech. The experiments involve analysis-synthesis of speech with a vocoder that uses natural glottal flow pulses to reconstruct the voiced excitation. First, the phase spectra of the glottal flow waveforms are converted to either zero phase or random phase. Second, the quality of vocoded speech generated with the two phase-modified pulses is compared in subjective listening tests with that of the corresponding signal excited with the natural-phase pulse. The results indicate that phase has a perceptually relevant effect in vocoded speech and that using natural phase improves synthesis quality.
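
The phase conversion described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, since the abstract does not specify the procedure. It assumes the modification is done per pulse with a discrete Fourier transform, keeping the amplitude spectrum and replacing only the phase. The function name `modify_phase` and the toy pulse are hypothetical; the toy pulse is a simple asymmetric shape standing in for a natural glottal flow pulse.

```python
import numpy as np


def modify_phase(pulse, mode, rng=None):
    """Return a pulse with the same amplitude spectrum and a replaced phase.

    Hypothetical helper for illustration; mode is "zero" or "random".
    """
    n = len(pulse)
    magnitude = np.abs(np.fft.rfft(pulse))

    if mode == "zero":
        # Zero phase: every frequency bin becomes real and non-negative.
        phase = np.zeros_like(magnitude)
    elif mode == "random":
        rng = np.random.default_rng() if rng is None else rng
        phase = rng.uniform(-np.pi, np.pi, size=magnitude.shape)
        # DC (and Nyquist, for even lengths) must stay real for a real signal.
        phase[0] = 0.0
        if n % 2 == 0:
            phase[-1] = 0.0
    else:
        raise ValueError("mode must be 'zero' or 'random'")

    return np.fft.irfft(magnitude * np.exp(1j * phase), n=n)


# Toy asymmetric pulse (an assumption, not a measured glottal flow pulse).
t = np.arange(64) / 64.0
pulse = t**2 * (1.0 - t)

zero_pulse = modify_phase(pulse, "zero")
random_pulse = modify_phase(pulse, "random", rng=np.random.default_rng(0))
```

In this sketch the zero-phase pulse comes out circularly symmetric around sample 0, so it would typically be shifted (for example with `np.roll`) before being placed in an excitation signal. All three pulses share one amplitude spectrum, which is the property that isolates phase as the only variable in a listening comparison.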


Bibliographic reference. Raitio, Tuomo / Juvela, Lauri / Suni, Antti / Vainio, Martti / Alku, Paavo (2015): "Phase perception of the glottal excitation of vocoded speech", in INTERSPEECH-2015, 254-258.