4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Speaker Adaptation Using Tree Structured Shared-State HMMs

Jun Ishii (1), Masahiro Tonomura (1), Shoichi Matsunaga (2)

(1) ATR Interpreting Telecommunications Research Labs., Soraku-gun, Kyoto, Japan
(2) NTT Human Interface Labs., Yokosuka, Kanagawa, Japan

This paper proposes a novel speaker adaptation method that flexibly controls the state-sharing of HMMs according to the amount of adaptation data. In our scheme, acoustic modeling is combined with adaptation so that the sharing characteristics of the acoustic models are used efficiently during adaptation. The set of shared states is determined from tree-structured shared-state HMMs, which are built from the history recorded during acoustic model generation. The proposed method is applied to both parameter-tying and parameter-smoothing techniques. Experiments were performed on a Japanese phoneme recognition task using continuous-density Gaussian-mixture HMMs. With 50 adaptation phrases, the phoneme recognition error rate was reduced by 42% relative to the speaker-independent model.
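
The idea of letting the amount of adaptation data decide how coarsely states are shared can be illustrated with a small sketch. It is not the authors' algorithm; it only assumes a tree whose leaves are HMM states, a per-leaf count of adaptation frames, and a hypothetical threshold `min_frames`. Each leaf state is assigned to the lowest node on its path to the root whose subtree has collected at least `min_frames` frames, so more data yields finer sharing classes. All names (`Node`, `sharing_classes`, the state labels, and the frame counts) are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    frames: int = 0  # adaptation frames seen at a leaf state (unused for internal nodes)
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def total_frames(node: Node) -> int:
    """Adaptation frames pooled over all leaf states below this node."""
    if not node.children:
        return node.frames
    return sum(total_frames(c) for c in node.children)


def leaves(node: Node) -> List[Node]:
    if not node.children:
        return [node]
    out: List[Node] = []
    for c in node.children:
        out.extend(leaves(c))
    return out


def sharing_classes(root: Node, min_frames: int) -> Dict[str, str]:
    """Map each leaf state to the lowest ancestor (or itself) with enough pooled data."""
    classes: Dict[str, str] = {}
    for leaf in leaves(root):
        node = leaf
        # Climb towards the root until the pooled data reaches the threshold.
        while node.parent is not None and total_frames(node) < min_frames:
            node = node.parent
        classes[leaf.name] = node.name
    return classes


# Toy tree: two intermediate nodes, four leaf states with made-up frame counts.
root = Node("root")
A = root.add(Node("A"))
B = root.add(Node("B"))
A.add(Node("a1", frames=120))
A.add(Node("a2", frames=5))
B.add(Node("b1", frames=10))
B.add(Node("b2", frames=8))

coarse = sharing_classes(root, min_frames=50)
fine = sharing_classes(root, min_frames=10)
```

In this toy example, raising the threshold pushes the sparsely observed states `b1` and `b2` up to the root, while lowering it lets `b1` be adapted on its own. How the shared parameters are then re-estimated (tying or smoothing) is outside the scope of this sketch.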

