Interspeech'2005 - Eurospeech
A crucial issue in diphone-based concatenative synthesis is handling the discontinuities between concatenated units. Here the problem is addressed by analyzing the diphones in a way suited to parametric synthesis. The synthesis model is the lossy tube model, an extension of the standard lattice filter that accounts for frequency-dependent vocal tract losses. The parameters of the tube model are estimated from the diphones by an optimization algorithm. Discontinuities of the model parameters at the diphone joints degrade the quality of the synthesized speech. To reduce the mismatch between parameter configurations at the diphone boundaries, a specific analysis of the diphone database is proposed, in which each diphone is analyzed with respect to other diphones containing the same phonemes. This reduces the parameter mismatches at the diphone joints and improves the concatenation results considerably.
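To make the idea of a parameter mismatch at a diphone joint concrete, the following is a minimal sketch based on the standard lossless tube model only. It is not the lossy tube model or the optimization procedure of the paper. The function names, the sign convention for the reflection coefficients, and the use of a Euclidean distance over log area ratios as the mismatch measure are all assumptions made for illustration.

```python
import math


def reflection_coefficients(areas):
    """Reflection coefficients of a lossless tube from its section areas.

    Assumed convention: k_i = (A_{i+1} - A_i) / (A_{i+1} + A_i).
    Other texts use the opposite sign.
    """
    return [(a_next - a) / (a_next + a) for a, a_next in zip(areas, areas[1:])]


def log_area_ratios(coeffs):
    """Log area ratios; with the convention above this equals log(A_{i+1} / A_i)."""
    return [math.log((1.0 + k) / (1.0 - k)) for k in coeffs]


def joint_mismatch(left_end_areas, right_start_areas):
    """Hypothetical mismatch measure at a diphone joint.

    Compares the tube configuration at the end of the left diphone with the
    configuration at the start of the right diphone, using the Euclidean
    distance between their log area ratios.
    """
    left = log_area_ratios(reflection_coefficients(left_end_areas))
    right = log_area_ratios(reflection_coefficients(right_start_areas))
    return math.sqrt(sum((l - r) ** 2 for l, r in zip(left, right)))


if __name__ == "__main__":
    # Illustrative area values only; not taken from any diphone database.
    end_of_left_diphone = [1.0, 2.0, 2.5, 1.5]
    start_of_right_diphone = [1.0, 1.8, 2.9, 1.4]
    print(joint_mismatch(end_of_left_diphone, start_of_right_diphone))
```

In this sketch a mismatch of zero means that both diphones describe the same tube shape at the joint. An analysis of the kind proposed in the abstract would aim to keep such a boundary distance small for diphones that share a phoneme.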
Bibliographic reference. Schnell, Karl / Lacroix, Arild (2005): "Model based analysis of a diphone database for improved unit concatenation", In INTERSPEECH-2005, 2605-2608.