The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis

Kyoto, Japan
September 22-24, 2010

An HMM-based Singing Style Modeling System for Singing Voice Synthesizers

Keijiro Saino, Makoto Tachibana, Hideki Kenmochi

Corporate Research & Development Center, Yamaha Corporation, Japan

This paper describes a statistical method for modeling singing styles. In this system, singing expression parameters, consisting of melody and dynamics derived from fundamental frequency (F0) and power, are modeled by context-dependent Hidden Markov Models (HMMs). The modeling method is tailored to handling these parameters. Since the parameters we focus on are common to most singing voice synthesizers, parameters generated from the trained models may be applicable to many of them. As a result, parameters that produce an "expressive" synthesized sound are generated automatically from the trained models using score data of arbitrary songs. In the experiment, we trained singing style models using singing voices recorded in a highly expressive style. Parameters generated for songs not included in the training data were applied to our singing synthesizer VOCALOID. The style was clearly perceived in the synthesized sound, which retained sufficient naturalness.
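As a rough illustration of the kind of statistical modeling involved (not the authors' actual context-dependent HMM system, which uses note/score contexts and ML parameter generation), the following Python sketch fits a plain Gaussian HMM to a joint F0/power feature sequence and samples a new contour from it. The hmmlearn library and the synthetic stand-in data are assumptions made for this example.

```python
# Illustrative sketch only: a plain GaussianHMM over joint F0/power features,
# not the paper's context-dependent HMM system. numpy and hmmlearn are assumed.
import numpy as np
from hmmlearn import hmm

# Stand-in "expression" features: log-F0 and power contours, one frame per row.
rng = np.random.default_rng(0)
n_frames = 500
log_f0 = 5.5 + 0.1 * np.sin(np.linspace(0, 20, n_frames)) + 0.01 * rng.standard_normal(n_frames)
power = -20.0 + 3.0 * np.cos(np.linspace(0, 10, n_frames)) + 0.5 * rng.standard_normal(n_frames)
features = np.column_stack([log_f0, power])  # shape (n_frames, 2)

# Fit a small Gaussian HMM to the joint melody/dynamics trajectory.
model = hmm.GaussianHMM(n_components=5, covariance_type="diag", n_iter=50, random_state=0)
model.fit(features)

# Draw a new contour from the trained model; the real system instead generates
# parameters from context-dependent models given the score of an arbitrary song.
generated, states = model.sample(200)
print(generated[:5], states[:5])
```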

Index Terms: singing voice synthesis, singing style, HMM


Bibliographic reference. Saino, Keijiro / Tachibana, Makoto / Kenmochi, Hideki (2010): "An HMM-based singing style modeling system for singing voice synthesizers", In SSW7-2010, 252-257.