EUROSPEECH 2003 - INTERSPEECH 2003
This paper presents an automatic speech segmentation method based on HMM alignment and a categorized multiple-expert fine adjustment. The accuracy of syllable boundaries is significantly improved (72.8% and 51.9% for starting and ending boundaries of syllables, respectively) after the fine adjustment. A novel phonetic verification method for checking inconsistency between the text script and the recorded speech is also proposed. The design and performance of confidence measures for both segmentation and verification are described, showing that automatic detection of problematic speech segments can be achieved. Together, these methods greatly reduce the human labor needed to construct our new corpus-based TTS system.
Bibliographic reference. Kuo, Chih-Chung / Kuo, Chi-Shiang / Chen, Jau-Hung / Chang, Sen-Chia (2003): "Automatic speech segmentation and verification for concatenative synthesis", In EUROSPEECH-2003, 305-308.