ISCA Archive ICSLP 2000

A solution to the reduction of concatenation artefacts in speech synthesis

Esther Klabbers, Raymond Veldhuis, Kim Koppen

One obstacle to high-quality speech synthesis is the occurrence of audible discontinuities at segment boundaries. Formant jumps across concatenation points suggest that the problem is due to spectral differences. It is most apparent in vowels and semi-vowels. We propose to reduce the number of audible discontinuities by adding context-sensitive diphones to the database. The number of additional diphones is limited by clustering contexts that have similar spectral effects on the neighbouring vowels, using the Kullback-Leibler distance. A listening experiment showed that the percentage of perceived discontinuities decreased significantly.
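The clustering idea can be illustrated with a minimal sketch. The abstract does not give the exact form of the distance or the clustering procedure, so everything here is an assumption: the symmetrised Kullback-Leibler form, the greedy threshold grouping, the function names, and the toy spectra are all hypothetical and are not taken from the paper.

```python
import math


def normalise(spectrum):
    """Scale a non-negative power spectrum so that its bins sum to 1."""
    total = sum(spectrum)
    if total <= 0:
        raise ValueError("spectrum must have positive total power")
    return [p / total for p in spectrum]


def symmetric_kl(p_spec, q_spec, eps=1e-12):
    """Symmetrised Kullback-Leibler distance: sum of (p - q) * log(p / q).

    Assumption: one common symmetric form; eps guards against empty bins.
    """
    p = [x + eps for x in normalise(p_spec)]
    q = [x + eps for x in normalise(q_spec)]
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))


def cluster_contexts(spectra, threshold):
    """Greedy grouping (hypothetical procedure): a context joins the first
    cluster whose first member lies within `threshold`, else starts a new one."""
    clusters = []
    for name, spec in spectra.items():
        for cluster in clusters:
            if symmetric_kl(spec, spectra[cluster[0]]) < threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Toy vowel spectra measured next to three consonant contexts (made-up values).
toy_spectra = {
    "p": [1.0, 4.0, 2.0, 1.0],
    "b": [1.0, 4.0, 2.0, 1.1],
    "k": [4.0, 1.0, 1.0, 2.0],
}
clusters = cluster_contexts(toy_spectra, threshold=0.05)
```

With these toy values, the contexts "p" and "b" fall into one cluster and "k" into another, so only one context-sensitive diphone variant per cluster would be needed rather than one per context.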


Cite as: Klabbers, E., Veldhuis, R., Koppen, K. (2000) A solution to the reduction of concatenation artefacts in speech synthesis. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 474-477

@inproceedings{klabbers00_icslp,
  author={Esther Klabbers and Raymond Veldhuis and Kim Koppen},
  title={{A solution to the reduction of concatenation artefacts in speech synthesis}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 3, 474-477}
}