11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Speaker and Language Adaptive Training for HMM-Based Polyglot Speech Synthesis

Heiga Zen

Toshiba Research Europe Ltd., UK

This paper proposes a speaker and language adaptive training technique for HMM-based polyglot speech synthesis. Language-specific context dependencies are captured using cluster adaptive training (CAT) with cluster-dependent decision trees. Acoustic variation caused by speaker characteristics is handled by transforms based on constrained maximum likelihood linear regression (CMLLR). This framework enables multi-speaker, multi-language adaptive training and synthesis. Experimental results show that the proposed technique achieves better synthesis performance than both speaker-adaptively trained language-dependent and language-independent models.
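As a rough illustration of the two components named in the abstract, the sketch below shows how a CAT-style mean can be formed as a weighted sum of cluster means, and how a CMLLR-style affine transform can be applied to an observation before scoring it. This is a minimal toy example under stated assumptions, not the paper's implementation: the function names, the two-dimensional vectors, the weights, and the transform values are all hypothetical, and the decision-tree lookup that would select each cluster's mean for a given context is omitted.

```python
import math

def cat_mean(cluster_means, weights):
    """Interpolate cluster mean vectors with (here, language-dependent) weights."""
    dim = len(cluster_means[0])
    return [sum(w * m[d] for w, m in zip(weights, cluster_means))
            for d in range(dim)]

def cmllr_transform(A, b, x):
    """Apply a (here, speaker-dependent) affine feature transform: A x + b."""
    return [sum(a * xi for a, xi in zip(row, x)) + bi
            for row, bi in zip(A, b)]

def log_likelihood(x, mean, var, A, b, log_det_A):
    """Diagonal-Gaussian log-likelihood of the transformed observation,
    plus the log|det A| term that a feature-space transform introduces."""
    y = cmllr_transform(A, b, x)
    ll = log_det_A
    for yi, mi, vi in zip(y, mean, var):
        ll += -0.5 * (math.log(2 * math.pi * vi) + (yi - mi) ** 2 / vi)
    return ll

# Hypothetical toy values: three clusters, two-dimensional features.
cluster_means = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.0]]
language_weights = [1.0, 0.3, 0.7]
mean = cat_mean(cluster_means, language_weights)

# Hypothetical diagonal transform, so log|det A| = log(2.0 * 0.5) = 0.
A = [[2.0, 0.0], [0.0, 0.5]]
b = [0.1, -0.2]
x = [1.0, 4.0]
y = cmllr_transform(A, b, x)
ll = log_likelihood(x, y, [1.0, 1.0], A, b, math.log(2.0 * 0.5))
```

In this sketch the weights vary with the language and the transform varies with the speaker, which mirrors the division of labour the abstract describes; how those parameters are estimated during adaptive training is not shown.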


Bibliographic reference.  Zen, Heiga (2010): "Speaker and language adaptive training for HMM-based polyglot speech synthesis", In INTERSPEECH-2010, 410-413.