A sinusoidal representation of speech is an alternative to the source-filter model. It is widely used in speech coding and unit-selection TTS, but is less common in statistical TTS frameworks. In this work we use Regularized Cepstral Coefficients (RCC) estimated on the mel-frequency scale to model the amplitude spectrum envelope within an HMM-based TTS platform. We report improved subjective quality for mel-frequency RCC (MRCC) combined with sinusoidal-model-based reconstruction, compared to the state-of-the-art MGC-LSP parameters.
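The abstract does not spell out how the MRCC are estimated, so the following is only an illustrative sketch of one common way to fit regularized cepstral coefficients to sinusoidal peak amplitudes on a warped frequency axis. Everything specific here is an assumption rather than the authors' method: the mel formula (the widely used 2595 log10(1 + f/700) mapping), the quadratic smoothness penalty weighted by n^2, the default order and regularization weight, and the hypothetical function names `hz_to_mel`, `fit_regularized_cepstrum`, and `envelope_from_cepstrum`.

```python
import numpy as np


def hz_to_mel(f_hz):
    """Map Hz to mel with a common textbook formula (assumed, not from the paper)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)


def _cosine_basis(freqs_hz, fs, order):
    """Cosine basis on a mel-warped axis scaled so that Nyquist maps to pi."""
    theta = np.pi * hz_to_mel(freqs_hz) / hz_to_mel(fs / 2.0)
    basis = np.cos(np.outer(theta, np.arange(order + 1)))
    basis[:, 1:] *= 2.0  # log A(theta) = c0 + 2 * sum_{n>=1} c_n cos(n * theta)
    return basis


def fit_regularized_cepstrum(freqs_hz, amps, fs, order=20, lam=1e-3):
    """Penalized least-squares fit of cepstral coefficients to log amplitudes.

    Minimizes ||log(amps) - B c||^2 + lam * sum_n n^2 c_n^2, where the
    penalty discourages rapidly oscillating envelopes between the peaks.
    """
    log_amp = np.log(np.asarray(amps, dtype=float))
    basis = _cosine_basis(freqs_hz, fs, order)
    penalty = np.diag(np.arange(order + 1, dtype=float) ** 2)
    lhs = basis.T @ basis + lam * penalty
    rhs = basis.T @ log_amp
    return np.linalg.solve(lhs, rhs)


def envelope_from_cepstrum(cepstrum, freqs_hz, fs):
    """Evaluate the amplitude envelope at the given frequencies (in Hz)."""
    basis = _cosine_basis(freqs_hz, fs, len(cepstrum) - 1)
    return np.exp(basis @ np.asarray(cepstrum, dtype=float))


if __name__ == "__main__":
    fs = 16000.0
    harmonics = 120.0 * np.arange(1, 66)        # harmonic frequencies of a 120 Hz voice
    amps = 1.0 / (1.0 + harmonics / 1000.0)     # synthetic, smoothly decaying amplitudes
    c = fit_regularized_cepstrum(harmonics, amps, fs)
    fitted = envelope_from_cepstrum(c, harmonics, fs)
    print("max log-amplitude error:", np.max(np.abs(np.log(fitted) - np.log(amps))))
```

In this sketch the regularization weight `lam` trades fidelity at the sinusoidal peaks against smoothness of the envelope between them; the warping concentrates the cepstral resolution at low frequencies, which is the usual motivation for a mel-scale parameterization.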
Bibliographic reference. Shechtman, Slava / Sorin, Alex (2010): "Sinusoidal model parameterization for HMM-based TTS system", In INTERSPEECH-2010, 805-808.