Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

A Solution to the Reduction of Concatenation Artefacts in Speech Synthesis

Esther Klabbers, Raymond Veldhuis, Kim Koppen

IPO, Center for User-System Interaction, Eindhoven, The Netherlands

One obstacle to high-quality speech synthesis is the occurrence of audible discontinuities at segment boundaries. Formant jumps across concatenation points suggest that the problem is due to spectral differences between the joined segments. The problem is most apparent in vowels and semi-vowels. We propose to reduce the number of audible discontinuities by adding context-sensitive diphones to the database. The number of additional diphones is limited by clustering contexts that have similar spectral effects on the neighbouring vowels, using the Kullback-Leibler distance. A listening experiment showed that the percentage of perceived discontinuities decreased significantly.
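As a rough illustration of the clustering idea mentioned in the abstract, the following minimal sketch computes a symmetrised Kullback-Leibler distance between two power spectra and greedily groups consonant contexts whose spectra lie close together. It is not the authors' procedure: the function names, the greedy grouping strategy, the threshold value, and the toy spectra are all assumptions made for illustration, and the abstract does not specify how the spectra are estimated or how the clusters are formed.

```python
import math


def normalize(spectrum):
    """Scale a non-negative power spectrum so that its bins sum to 1."""
    total = sum(spectrum)
    if total <= 0:
        raise ValueError("spectrum must have positive total power")
    return [p / total for p in spectrum]


def symmetric_kl(spec_a, spec_b, eps=1e-12):
    """Symmetrised Kullback-Leibler distance between two power spectra.

    Both spectra are normalised to unit sum first, so overall level
    differences do not contribute. eps guards against log(0).
    """
    p = normalize(spec_a)
    q = normalize(spec_b)
    # sum((p - q) * log(p / q)) equals KL(p||q) + KL(q||p).
    return sum((pi - qi) * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q))


def group_contexts(spectra, threshold):
    """Greedy grouping (an illustrative assumption, not the paper's method).

    A context joins the first cluster whose representative (its first
    member) is closer than `threshold`; otherwise it starts a new cluster.
    """
    clusters = []
    for label, spec in spectra.items():
        for cluster in clusters:
            if symmetric_kl(spectra[cluster[0]], spec) < threshold:
                cluster.append(label)
                break
        else:
            clusters.append([label])
    return clusters


# Hypothetical vowel spectra measured next to three consonant contexts.
toy_spectra = {
    "p": [1.0, 4.0, 2.0, 0.5],
    "b": [1.1, 3.9, 2.1, 0.5],
    "r": [0.5, 1.0, 4.0, 3.0],
}
clusters = group_contexts(toy_spectra, threshold=0.05)
```

In this toy setting, contexts that end up in the same cluster would share one context-sensitive diphone, which is how clustering keeps the number of additional database entries limited.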


Bibliographic reference. Klabbers, Esther / Veldhuis, Raymond / Koppen, Kim (2000): "A solution to the reduction of concatenation artefacts in speech synthesis", in ICSLP-2000, vol. 3, pp. 474-477.