Conventional HMM-based TTS systems for Hanoi Vietnamese often suffer from a hoarse voice quality caused by incomplete F0 parameterization of glottalized tones. Because estimating F0 in glottalized regions is problematic for standard F0 extractors, we propose a pitch marking algorithm in which pitch marks are propagated from regular regions of the speech signal to glottalized ones, and the complete F0 contour of a glottalized tone is then derived from those marks. Both objective and listening tests confirmed that the proposed F0 parameterization scheme significantly reduces hoarseness while improving the tone naturalness of synthetic speech. The pitch marking algorithm works as a refinement step on the output of an F0 extractor, so the proposed scheme can be combined with any F0 extractor.
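The last step described above, deriving an F0 contour from pitch marks, can be illustrated with a minimal sketch. It is not the authors' propagation algorithm: it only assumes that pitch marks (in seconds) are already available, takes F0 as the reciprocal of the interval between consecutive marks, and linearly interpolates onto a frame grid. The function name `f0_from_pitch_marks`, its parameters, and the 5 ms frame shift are hypothetical choices made for illustration.

```python
import numpy as np

def f0_from_pitch_marks(marks_s, frame_shift_s=0.005, duration_s=None):
    """Derive a frame-level F0 contour (Hz) from pitch-mark times (seconds).

    Illustrative only: the interpolation scheme and the default frame
    shift are assumptions, not taken from the paper.
    """
    marks = np.asarray(marks_s, dtype=float)
    if marks.size < 2:
        raise ValueError("need at least two pitch marks")
    periods = np.diff(marks)               # interval between consecutive marks
    if np.any(periods <= 0):
        raise ValueError("pitch marks must be strictly increasing")
    local_f0 = 1.0 / periods               # one F0 value per pitch period
    centers = marks[:-1] + periods / 2.0   # place each value at mid-period
    if duration_s is None:
        duration_s = marks[-1]
    frame_times = np.arange(0.0, duration_s, frame_shift_s)
    # np.interp holds the first/last value constant outside the centers' range
    f0 = np.interp(frame_times, centers, local_f0)
    return frame_times, f0
```

For example, marks at 0, 10, and 30 ms give periods of 10 ms and 20 ms, so the contour moves from 100 Hz toward 50 Hz. Because the contour depends only on mark positions, it stays defined wherever marks exist, including regions where a frame-based extractor would report no F0.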
Bibliographic reference. Ninh, Duy Khanh / Yamashita, Yoichi (2015): "F0 parameterization of glottalized tones for HMM-based Vietnamese TTS", In INTERSPEECH-2015, 2202-2206.