ISCA Archive SSW 1998

Concatenative speech synthesis using a harmonic plus noise model

Yannis Stylianou

This paper describes the application of the Harmonic plus Noise Model (HNM) to concatenative Text-to-Speech (TTS) synthesis. In HNM, speech signals are represented as a time-varying harmonic component plus a modulated noise component. Decomposing the speech signal into these two components allows for more natural-sounding modifications of the signal (e.g., source and filter modifications). The parametric representation of speech using HNM also provides a straightforward way of smoothing discontinuities between acoustic units around concatenation points. Formal listening tests have shown that HNM provides high-quality speech synthesis and outperforms other synthesis models (e.g., TD-PSOLA) in intelligibility, naturalness, and pleasantness.
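
The two-component representation described above can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not specify: the function name `synthesize_hnm_frame`, its parameters, the raised-cosine noise envelope, and the use of white Gaussian noise are all assumptions chosen for illustration. In particular, the sketch does not band-limit either component.

```python
import math
import random


def synthesize_hnm_frame(f0, amplitudes, phases, noise_gain,
                         sample_rate=16000, num_samples=160, seed=0):
    """Return one frame as a list of samples: harmonics of f0 plus modulated noise.

    All names and defaults here are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    period = sample_rate / f0  # pitch period in samples
    frame = []
    for n in range(num_samples):
        t = n / sample_rate
        # Harmonic component: sum of sinusoids at integer multiples of f0.
        harmonic = sum(
            a * math.cos(2.0 * math.pi * (k + 1) * f0 * t + p)
            for k, (a, p) in enumerate(zip(amplitudes, phases))
        )
        # Noise component: white noise shaped by a pitch-synchronous
        # raised-cosine envelope (an assumed form of the time modulation).
        envelope = 0.5 * (1.0 - math.cos(2.0 * math.pi * n / period))
        noise = noise_gain * envelope * rng.gauss(0.0, 1.0)
        frame.append(harmonic + noise)
    return frame


# Example: a 100 Hz voiced frame with three harmonics and a little noise.
frame = synthesize_hnm_frame(100.0, [1.0, 0.5, 0.25], [0.0, 0.0, 0.0], 0.05)
```

Because each frame is described by a small set of parameters (here the fundamental frequency, harmonic amplitudes and phases, and a noise gain), those parameters could in principle be interpolated across a unit boundary, which is one way to read the abstract's remark about smoothing discontinuities at concatenation points.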


Cite as: Stylianou, Y. (1998) Concatenative speech synthesis using a harmonic plus noise model. Proc. 3rd ESCA/COCOSDA Workshop on Speech Synthesis (SSW 3), 261-266

@inproceedings{stylianou98_ssw,
  author={Yannis Stylianou},
  title={{Concatenative speech synthesis using a harmonic plus noise model}},
  year=1998,
  booktitle={Proc. 3rd ESCA/COCOSDA Workshop on Speech Synthesis (SSW 3)},
  pages={261--266}
}