HMM-based speech synthesis, in which speech parameters are generated directly from HMMs, is a relatively new technique compared with other speech synthesis approaches. In this paper, we propose modifications to the basic system to improve its quality: we apply a multi-band excitation model and use samples of the spectral envelope as spectral parameters. During synthesis, the voiced and unvoiced parts of the speech are mixed according to per-band voicing parameters, and the voiced part is generated with a harmonic sinusoidal model. Experimental tests on an Arabic dataset show that these modifications improve the synthesis quality.
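To make the synthesis step concrete, the following is a minimal single-frame sketch of multi-band mixed excitation. It is not the implementation used in the paper. The function name `synthesize_frame`, the equal-width band layout, the representation of the spectral envelope as a callable, and the frame length and sample rate are all assumptions made for illustration. Phase continuity and overlap-add across frames are omitted, and `f0` is assumed to be positive.

```python
import numpy as np

def synthesize_frame(f0, envelope, band_voicing,
                     sample_rate=16000, n_samples=400, seed=0):
    """Hypothetical single-frame multi-band mixed-excitation synthesis.

    f0           : fundamental frequency in Hz (assumed > 0)
    envelope     : callable mapping an array of frequencies in Hz to linear
                   amplitudes (stand-in for the sampled spectral envelope)
    band_voicing : sequence of booleans, one per equal-width band covering
                   0 Hz to the Nyquist frequency (True means voiced)
    Returns (voiced, unvoiced, mixed) arrays of length n_samples.
    """
    nyquist = sample_rate / 2.0
    voicing = np.asarray(band_voicing, dtype=bool)
    n_bands = len(voicing)

    def band_of(freqs):
        # Map each frequency to the index of the band that contains it.
        idx = np.floor(np.asarray(freqs) / nyquist * n_bands).astype(int)
        return np.clip(idx, 0, n_bands - 1)

    t = np.arange(n_samples) / sample_rate

    # Voiced part: harmonic sinusoidal model. Sum cosines at multiples of f0,
    # keeping only harmonics that fall inside bands marked voiced.
    harmonics = np.arange(1, int(nyquist // f0) + 1) * f0
    harmonics = harmonics[harmonics < nyquist]
    harmonics = harmonics[voicing[band_of(harmonics)]]
    amps = envelope(harmonics)
    voiced = (amps[:, None]
              * np.cos(2.0 * np.pi * harmonics[:, None] * t[None, :])).sum(axis=0)

    # Unvoiced part: white noise shaped by the envelope in the frequency
    # domain, with the gain set to zero inside voiced bands.
    rng = np.random.default_rng(seed)
    noise_spectrum = np.fft.rfft(rng.standard_normal(n_samples))
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)
    gain = envelope(freqs) * (~voicing[band_of(freqs)])
    unvoiced = np.fft.irfft(noise_spectrum * gain, n=n_samples)

    return voiced, unvoiced, voiced + unvoiced
```

For example, calling `synthesize_frame(120.0, lambda f: np.exp(-f / 2000.0), [True, True, False, False])` would place harmonics of 120 Hz in the lower half of the spectrum and shaped noise in the upper half. A complete synthesizer would also need to interpolate parameters between frames and join successive frames smoothly.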
Cite as: Abdel-Hamid, O., Abdou, S.M., Rashwan, M. (2006) Improving Arabic HMM based speech synthesis quality. Proc. Interspeech 2006, paper 1693-Tue3BuP.10, doi: 10.21437/Interspeech.2006-390
@inproceedings{abdelhamid06_interspeech,
  author={Ossama Abdel-Hamid and Sherif Mahdy Abdou and Mohsen Rashwan},
  title={{Improving Arabic HMM based speech synthesis quality}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1693-Tue3BuP.10},
  doi={10.21437/Interspeech.2006-390}
}