Spectral smoothing for concatenative speech synthesis

David T. Chappell, John H. L. Hansen

This paper addresses concatenative speech synthesis with a limited database by proposing methods to smooth the transitions between speech segments. The objective is to produce natural-sounding speech via segment concatenation even when formants and other spectral features do not align properly at the segment boundaries. We examine several techniques for adjusting the spectra between waveform segments selected for concatenation: optimal coupling, waveform interpolation, linear predictive pole shifting, and psychoacoustic closure. Several of these algorithms were previously developed for either coding or synthesis, but our application of closure to segment processing is novel. After spectral smoothing, the final synthesized speech better approximates the desired speech characteristics and is continuous in both the time domain and spectral structure.
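To make the continuity objective concrete, the following minimal sketch joins two mismatched segments with a plain time-domain linear cross-fade and compares the largest sample-to-sample jump against an abrupt join. This is a generic illustration and not one of the four techniques examined in the paper; the function names `crossfade_concatenate` and `max_jump`, the constant-valued test segments, and the overlap length are all assumptions chosen for the example.

```python
def crossfade_concatenate(left, right, overlap):
    """Join two sample lists, linearly blending `overlap` samples at the boundary.

    Hypothetical helper for illustration only: a plain linear cross-fade,
    not the paper's spectral smoothing methods.
    """
    if overlap < 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must fit inside both segments")
    if overlap == 0:
        return list(left) + list(right)

    start = len(left) - overlap          # index in `left` where blending begins
    head = list(left[:start])            # untouched part of the left segment
    tail = list(right[overlap:])         # untouched part of the right segment

    blended = []
    for i in range(overlap):
        # Weight of the right segment rises linearly, staying strictly in (0, 1).
        w = (i + 1) / (overlap + 1)
        blended.append((1.0 - w) * left[start + i] + w * right[i])
    return head + blended + tail


def max_jump(signal):
    """Largest absolute difference between consecutive samples."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))


# Two assumed segments whose levels disagree at the boundary.
left = [1.0] * 20
right = [-1.0] * 20

abrupt = left + right
smooth = crossfade_concatenate(left, right, overlap=8)

print("abrupt join, max jump:", max_jump(abrupt))
print("cross-faded join, max jump:", max_jump(smooth))
```

A cross-fade of this kind only removes the time-domain discontinuity. It does not realign formants, which is the problem the spectral methods named in the abstract are meant to address.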

doi: 10.21437/ICSLP.1998-19

Cite as: Chappell, D.T., Hansen, J.H.L. (1998) Spectral smoothing for concatenative speech synthesis. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0849, doi: 10.21437/ICSLP.1998-19

