12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Decision Tree-Based Clustering with Outlier Detection for HMM-Based Speech Synthesis

Kyung Hwan Oh, June Sig Sung, Doo Hwa Hong, Nam Soo Kim

Seoul National University, Korea

To express natural prosodic variation in continuous speech, HMM-based speech synthesis usually employs sophisticated speech units such as context-dependent phone models. Since a training database cannot practically cover all possible context factors, decision tree-based HMM state clustering is commonly applied. A serious problem with decision tree-based methods is that the criterion used for node splitting and stopping is sensitive to irrelevant outlier data. In this paper, we propose a novel approach to removing outliers during the decision tree growing phase. Experimental results show that removing outlying models improves the quality of the synthesized speech, especially for sentences that originally demonstrated poor quality.
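
The sensitivity of the splitting criterion to outliers can be illustrated with a toy example. The sketch below is not the authors' method: it assumes a one-dimensional, single-Gaussian maximum-likelihood split criterion, treats each context-dependent model as one scalar with unit occupancy, and uses a hypothetical z-score rule (`inlier_bounds`, `z_max`) as a stand-in for outlier detection. All names and data values are invented for illustration.

```python
import math

VAR_FLOOR = 1e-6  # assumed variance floor, avoids log(0) for tight clusters


def node_log_likelihood(values):
    """Log-likelihood of 1-D samples under their own ML Gaussian fit."""
    n = len(values)
    mean = sum(values) / n
    var = max(sum((v - mean) ** 2 for v in values) / n, VAR_FLOOR)
    # At the ML estimates the Gaussian log-likelihood reduces to this form.
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def split_gain(yes, no):
    """Log-likelihood gain from splitting a node into 'yes' and 'no' children."""
    return (node_log_likelihood(yes) + node_log_likelihood(no)
            - node_log_likelihood(yes + no))


def inlier_bounds(values, z_max=2.0):
    """Hypothetical outlier rule: keep points within z_max std devs of the mean."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean - z_max * std, mean + z_max * std


# Toy node: both answers to a context question look alike, except one outlier.
yes = [1.0, 1.1, 0.9, 1.05, 0.95]
no = [1.0, 1.1, 0.9, 1.0, 9.0]

gain_raw = split_gain(yes, no)

# Detect outliers on the pooled node, then re-evaluate the same question.
lo, hi = inlier_bounds(yes + no)
yes_clean = [v for v in yes if lo <= v <= hi]
no_clean = [v for v in no if lo <= v <= hi]
gain_clean = split_gain(yes_clean, no_clean)

print(f"gain with outlier:    {gain_raw:.3f}")
print(f"gain without outlier: {gain_clean:.3f}")
```

In this toy setup the single outlier produces a large apparent gain, so the node would be split, whereas the gain is close to zero once the outlier is removed and the split would be rejected by any positive stopping threshold.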


Bibliographic reference. Oh, Kyung Hwan / Sung, June Sig / Hong, Doo Hwa / Kim, Nam Soo (2011): "Decision tree-based clustering with outlier detection for HMM-based speech synthesis", In INTERSPEECH-2011, 101-104.