This paper describes a technique for classifying emotional expressions and speaking styles in speech using only a small amount of training data from a target speaker. We model spectral and fundamental frequency (F0) features simultaneously using a multi-space probability distribution HMM (MSD-HMM), and adapt a speaker-independent neutral-style model to a target speaker's style model with a small amount of data using MSD-MLLR, an extension of MLLR to the MSD-HMM. We perform classification experiments on speech from professional narrators and from non-professional speakers, and evaluate the proposed technique by comparing it with other commonly used classifiers. We show that the proposed technique gives better results than the other classifiers when only a few sentences of the target speaker's style data are used.
Bibliographic reference. Tachibana, Makoto / Kawashima, Keigo / Yamagishi, Junichi / Kobayashi, Takao (2007): "Performance evaluation of HMM-based style classification with a small amount of training data", In INTERSPEECH-2007, 2261-2264.