Synchronization of speech frames is an important issue in concatenative speech synthesis systems. In signal processing terms, it amounts to removing linear phase mismatches between concatenated speech frames. This paper presents two novel approaches to the synchronization of speech frames, with an application to concatenative speech synthesis. Both methods process the phase spectra without degrading the quality of the input speech, in contrast to previously proposed methods. The first method is based on the notion of center of gravity and the second on differentiated phase data. The proposed methods have been tested with the Harmonic plus Noise Model (HNM) in the context of Text-to-Speech synthesis. The resulting synthetic speech is free of linear phase mismatches.
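As a rough illustration of what removing a linear phase mismatch means, the sketch below aligns two frames by moving each frame's center of gravity to a common reference position. It is not the algorithm of the paper: the abstract does not give the formulas, so the energy-weighted definition of center of gravity, the function names `center_of_gravity` and `remove_linear_phase`, and the choice of the frame midpoint as reference are all assumptions made for this example. It relies only on the standard fact that a time shift of d samples multiplies the spectrum by exp(-jωd), which is a linear phase term.

```python
import numpy as np

def center_of_gravity(frame):
    """Energy-weighted mean sample index (an assumed, illustrative definition)."""
    energy = np.asarray(frame, dtype=float) ** 2
    n = np.arange(len(energy))
    return float(np.sum(n * energy) / np.sum(energy))

def remove_linear_phase(frame, ref=None):
    """Circularly shift `frame` so its center of gravity lands at `ref`.

    The shift is applied in the frequency domain as a linear phase term,
    so it may be a fractional number of samples.
    """
    frame = np.asarray(frame, dtype=float)
    N = len(frame)
    if ref is None:
        ref = N // 2  # assumption: align every frame to its midpoint
    delay = ref - center_of_gravity(frame)      # samples to delay by
    omega = 2 * np.pi * np.fft.rfftfreq(N)      # radians per sample
    spectrum = np.fft.rfft(frame)
    shifted = spectrum * np.exp(-1j * omega * delay)
    return np.fft.irfft(shifted, n=N)

# Two copies of the same pulse at different positions inside the frame,
# i.e. two frames that differ only by a linear phase term.
n = np.arange(64)
frame_a = np.exp(-0.5 * ((n - 20) / 3.0) ** 2)
frame_b = np.exp(-0.5 * ((n - 40) / 3.0) ** 2)

aligned_a = remove_linear_phase(frame_a)
aligned_b = remove_linear_phase(frame_b)
```

After alignment the two frames should coincide up to numerical error, so they can be overlapped and added without the cancellation that a time offset would cause. The shift is circular, so this sketch is only meaningful for frames whose energy is concentrated away from the frame edges.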
Cite as: Stylianou, Y. (1999) Synchronization of speech frames based on phase data with application to concatenative speech synthesis. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 2343-2346, doi: 10.21437/Eurospeech.1999-512
@inproceedings{stylianou99_eurospeech,
  author={Yannis Stylianou},
  title={{Synchronization of speech frames based on phase data with application to concatenative speech synthesis}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={2343--2346},
  doi={10.21437/Eurospeech.1999-512}
}