ISCA Archive Interspeech 2005

Model based analysis of a diphone database for improved unit concatenation

Karl Schnell, Arild Lacroix

A crucial issue in diphone-based concatenation is handling the discontinuities between the concatenated units. This problem is addressed by a suitable analysis of the diphones for a parametric synthesis. The model used for the parametric synthesis is the lossy tube model, an extension of the standard lattice filter that accounts for frequency-dependent vocal tract losses. The parameters of the tube model are estimated from the diphones by an optimization algorithm. Discontinuities of the model parameters at the diphone joints decrease the quality of the synthesis results. To reduce the mismatch of the parameter configurations at the diphone boundaries, a specific analysis of the diphone database is proposed in which each diphone is analyzed with respect to the other diphones that contain its phonemes. This reduces the parameter mismatches at the diphone joints and improves the concatenation results considerably.
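As a rough illustration of the quantities involved, the sketch below uses the standard lossless tube model, not the lossy extension described in the abstract. It derives reflection coefficients from tube section areas and measures the parameter mismatch at a joint as a Euclidean distance. The function names, the sign convention for the reflection coefficients, and the choice of distance measure are assumptions made for this example and are not taken from the paper.

```python
import math


def areas_to_reflection(areas):
    """Reflection coefficients at the junctions of adjacent tube sections.

    Assumed convention: r_k = (A_k - A_{k+1}) / (A_k + A_{k+1}).
    Other texts use the opposite sign.
    """
    return [(a0 - a1) / (a0 + a1) for a0, a1 in zip(areas, areas[1:])]


def joint_mismatch(left_params, right_params):
    """Euclidean distance between the parameter vectors on both sides of a joint.

    This is a hypothetical mismatch measure chosen for illustration.
    """
    if len(left_params) != len(right_params):
        raise ValueError("parameter vectors must have the same length")
    return math.sqrt(sum((l - r) ** 2 for l, r in zip(left_params, right_params)))


# Made-up areas for the last frame of one diphone and the first frame of the next.
left = areas_to_reflection([1.0, 1.0, 3.0])
right = areas_to_reflection([1.0, 3.0, 3.0])
mismatch = joint_mismatch(left, right)
```

A smaller `mismatch` value would indicate that the two diphones meet with more similar tube configurations, which is the kind of agreement at the joints that the proposed analysis aims for.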


doi: 10.21437/Interspeech.2005-806

Cite as: Schnell, K., Lacroix, A. (2005) Model based analysis of a diphone database for improved unit concatenation. Proc. Interspeech 2005, 2605-2608, doi: 10.21437/Interspeech.2005-806

@inproceedings{schnell05_interspeech,
  author={Karl Schnell and Arild Lacroix},
  title={{Model based analysis of a diphone database for improved unit concatenation}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={2605--2608},
  doi={10.21437/Interspeech.2005-806}
}