ISCA Archive ICSLP 2000

Unit fusion for concatenative speech synthesis

Johan Wouters, Michael W. Macon

An important problem in concatenative synthesis is the occurrence of spectral discontinuities or "concatenation mismatch" between sonorant speech units. In this paper, we present an approach to reduce concatenation mismatch by combining spectral information from two sequences of speech units selected in parallel. Concatenation units, on one hand, define initial spectral trajectories for a target utterance. Fusion units, on the other hand, define the desired transitions between concatenated units. The two unit sequences are "fused" by imposing dynamic constraints defined by the fusion units on the spectral trajectories of the concatenation units. To regenerate the modified speech units, we use a synthesis algorithm based on sinusoidal + all-pole analysis of speech, which overcomes the limitations of residual-excited LPC. Results from a perceptual test show that our method is highly successful at removing concatenation artifacts in speech generated from an inventory of diphones.
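The fusion step described in the abstract can be illustrated with a small numerical sketch. This is not the authors' formulation: it assumes a single spectral parameter track (for example one formant-like frequency in Hz), a first-difference definition of "dynamics", and a weighted least-squares objective. The function name `fuse_trajectory` and the parameter `delta_weight` are hypothetical and chosen only for illustration.

```python
import numpy as np

def fuse_trajectory(concat, fusion, delta_weight=10.0):
    """Illustrative sketch only (assumed objective, not the paper's algorithm).

    Finds a trajectory x that stays close to the concatenation-unit values
    while its frame-to-frame differences follow those of the fusion unit:
        minimize ||x - concat||^2 + delta_weight * ||D x - D fusion||^2
    """
    concat = np.asarray(concat, dtype=float)
    fusion = np.asarray(fusion, dtype=float)
    n = concat.size
    # First-difference operator: (D @ x)[t] = x[t + 1] - x[t]
    D = np.diff(np.eye(n), axis=0)
    # Normal equations of the quadratic objective above
    A = np.eye(n) + delta_weight * (D.T @ D)
    b = concat + delta_weight * (D.T @ (D @ fusion))
    return np.linalg.solve(A, b)

# Hypothetical data: two concatenated units with a 200 Hz jump at the join,
# and a fusion unit that moves smoothly across the same span.
concat = np.array([1000.0] * 4 + [1200.0] * 4)
fusion = np.linspace(1000.0, 1200.0, 8)
fused = fuse_trajectory(concat, fusion)

print("largest jump before fusion:", np.abs(np.diff(concat)).max())
print("largest jump after fusion: ", np.abs(np.diff(fused)).max())
```

In this sketch, a larger `delta_weight` makes the result follow the fusion unit's transition shape more closely, while a smaller value keeps it nearer to the original concatenation units. The resynthesis stage (sinusoidal + all-pole) is not modeled here.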

doi: 10.21437/ICSLP.2000-536

Cite as: Wouters, J., Macon, M.W. (2000) Unit fusion for concatenative speech synthesis. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 302-305, doi: 10.21437/ICSLP.2000-536

  author={Johan Wouters and Michael W. Macon},
  title={{Unit fusion for concatenative speech synthesis}},
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 3, 302-305},