12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Improved HNM-Based Vocoder for Statistical Synthesizers

Daniel Erro, Iñaki Sainz, Eva Navas, Inma Hernáez

Universidad del País Vasco, Spain

Statistical parametric synthesizers have achieved very good performance scores in recent years. Nevertheless, because they rely on vocoders to parameterize speech (during training) and to reconstruct waveforms (during synthesis), the speech generated from statistical models lacks some degree of naturalness. In previous work we explored the usefulness of the harmonics plus noise model (HNM) in the design of a high-quality speech vocoder. Quite promising results were achieved when this vocoder was integrated into a synthesizer. In this paper, we describe some recent improvements related to the excitation parameters, particularly the so-called maximum voiced frequency. Its estimation and explicit modelling lead to even better synthesis performance, as confirmed by subjective comparisons with other well-known methods.

Bibliographic reference. Erro, Daniel / Sainz, Iñaki / Navas, Eva / Hernáez, Inma (2011): "Improved HNM-based vocoder for statistical synthesizers", in INTERSPEECH-2011, 1809-1812.