11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Sinusoidal Model Parameterization for HMM-Based TTS System

Slava Shechtman, Alex Sorin

IBM Research, Haifa Research Lab, Israel

A sinusoidal representation of speech is an alternative to the source-filter model. It is widely used in speech coding and unit-selection TTS, but is less common in statistical TTS frameworks. In this work we use Regularized Cepstral Coefficients (RCC), estimated on the mel-frequency scale, to model the amplitude spectrum envelope within an HMM-based TTS platform. We report improved subjective quality for mel-frequency RCC (MRCC) combined with sinusoidal-model-based reconstruction, compared to the state-of-the-art MGC-LSP parameters.
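
As a rough illustration of the idea behind mel-frequency regularized cepstral coefficients, the sketch below fits a cepstral envelope to sinusoidal peak amplitudes by penalized least squares. It is not the authors' formulation: the function names (hz_to_mel, fit_mrcc, envelope), the common 2595*log10(1 + f/700) mel formula, the n^2 smoothness penalty, and the default order and weight are all assumptions made for this example.

```python
import numpy as np

def hz_to_mel(f_hz):
    # Common mel-scale formula; the warping used in the paper is not given here (assumption).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def _cosine_basis(freqs_hz, fs, order):
    # Map frequencies to [0, pi] on the mel axis, then build the cosine basis
    # [1, 2cos(w), 2cos(2w), ..., 2cos(order*w)] for a real cepstrum.
    w = np.pi * hz_to_mel(freqs_hz) / hz_to_mel(fs / 2.0)
    n = np.arange(order + 1)
    basis = np.cos(np.outer(w, n))
    basis[:, 1:] *= 2.0
    return basis

def fit_mrcc(freqs_hz, amps, fs, order=20, lam=1e-3):
    # Penalized least squares on log amplitudes:
    #   minimize ||log(amps) - B c||^2 + lam * sum_n n^2 c_n^2
    # The n^2 weighting is one common smoothness penalty (assumption).
    basis = _cosine_basis(freqs_hz, fs, order)
    log_amp = np.log(np.asarray(amps, dtype=float))
    n = np.arange(order + 1, dtype=float)
    penalty = np.diag(lam * n ** 2)
    return np.linalg.solve(basis.T @ basis + penalty, basis.T @ log_amp)

def envelope(coeffs, freqs_hz, fs):
    # Evaluate the amplitude envelope described by the coefficients.
    coeffs = np.asarray(coeffs, dtype=float)
    basis = _cosine_basis(freqs_hz, fs, len(coeffs) - 1)
    return np.exp(basis @ coeffs)
```

Given harmonic frequencies and amplitudes from a sinusoidal analysis of one frame, fit_mrcc(freqs, amps, fs) returns order + 1 coefficients, and envelope(coeffs, freqs, fs) resamples the smoothed envelope at any set of frequencies. A larger lam yields a smoother envelope at the cost of a looser fit to the measured peaks.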

Bibliographic reference.  Shechtman, Slava / Sorin, Alex (2010): "Sinusoidal model parameterization for HMM-based TTS system", In INTERSPEECH-2010, 805-808.