8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

DFW-based Spectral Smoothing for Concatenative Speech Synthesis

Hartmut R. Pfitzinger

University of Munich, Germany

A new spectral smoothing technique is proposed and evaluated. Its performance is comparable to that of LSP interpolation in terms of Euclidean spectral distance measurements, but its interpolated formant trajectories are more reasonable from a phonetic point of view. The approach first estimates derivative logarithmic magnitude spectra of both the source and the target frame, each of which is represented by autoregressive filter coefficients. Dynamic programming (DP) then yields the best alignment between these two spectral representations. Smoothed frequency responses are obtained by weighted linear interpolation between corresponding source and target spectral lines, whose alignment is found by DP backtracking. Finally, the smoothed spectrum is converted back to autoregressive filter coefficients via the intermediate stage of autocorrelation coefficients.
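
The four steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the local cost, the path constraints, or the spectral resolution, so the absolute-difference cost, the unconstrained three-way DP step, the 129 frequency bins, and all function names below are assumptions chosen for illustration.

```python
import numpy as np

def lpc_log_spectrum(a, n_bins=129):
    """Log magnitude of the all-pole filter 1/A(z); a[0] is assumed to be 1."""
    return -np.log(np.abs(np.fft.rfft(a, 2 * (n_bins - 1))))

def dp_align(x, y):
    """DP alignment of two sequences (assumed cost: absolute difference)."""
    n, m = len(x), len(y)
    cost = np.abs(x[:, None] - y[None, :])
    acc = np.full((n, m), np.inf)
    acc[0, 0] = cost[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
                       acc[i - 1, j] if i > 0 else np.inf,
                       acc[i, j - 1] if j > 0 else np.inf)
            acc[i, j] = cost[i, j] + best
    # Backtrack from the last cell to (0, 0) along the cheapest predecessors.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        steps = []
        if i > 0 and j > 0:
            steps.append((acc[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            steps.append((acc[i - 1, j], i - 1, j))
        if j > 0:
            steps.append((acc[i, j - 1], i, j - 1))
        _, i, j = min(steps)
        path.append((i, j))
    return np.array(path[::-1])

def levinson(r, order):
    """Levinson-Durbin recursion: autocorrelation -> AR coefficients."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for k in range(1, order + 1):
        refl = -(r[k] + np.dot(a[1:k], r[k - 1:0:-1])) / err
        prev = a.copy()
        a[1:k] = prev[1:k] + refl * prev[k - 1:0:-1]
        a[k] = refl
        err *= 1.0 - refl ** 2
    return a

def dfw_interpolate(a_src, a_tgt, w, n_bins=129):
    """Interpolate between two AR frames; w=0 is the source, w=1 the target."""
    s = lpc_log_spectrum(a_src, n_bins)
    t = lpc_log_spectrum(a_tgt, n_bins)
    # Align the derivative log spectra, as the abstract describes.
    path = dp_align(np.gradient(s), np.gradient(t))
    i, j = path[:, 0], path[:, 1]
    # Interpolate both the position and the level of aligned spectral lines.
    freq = (1 - w) * i + w * j
    level = (1 - w) * s[i] + w * t[j]
    smoothed = np.interp(np.arange(n_bins), freq, level)
    # Power spectrum -> autocorrelation -> AR coefficients.
    r = np.fft.irfft(np.exp(2 * smoothed))
    return levinson(r, len(a_src) - 1)

def ar_from_poles(radius, angles):
    """Hypothetical helper: AR polynomial from conjugate pole pairs."""
    poles = radius * np.exp(1j * np.asarray(angles))
    return np.real(np.poly(np.concatenate([poles, poles.conj()])))

src = ar_from_poles(0.9, [0.2 * np.pi, 0.6 * np.pi])
tgt = ar_from_poles(0.9, [0.3 * np.pi, 0.7 * np.pi])
mid = dfw_interpolate(src, tgt, 0.5)
```

Because aligned spectral lines are interpolated in frequency position as well as in level, a spectral peak in the halfway frame `mid` is expected to lie between the matched source and target peaks rather than appearing as two cross-faded peaks, which is the behaviour the abstract attributes to the method's formant trajectories.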

Bibliographic reference. Pfitzinger, Hartmut R. (2004): "DFW-based spectral smoothing for concatenative speech synthesis", in INTERSPEECH-2004, 1397-1400.