We propose a two-step active learning method for supervised speaker adaptation. In the first step, initial adaptation data are collected and used to obtain a phone error distribution. In the second step, sentences whose phone distributions are close to this error distribution are selected, and utterances of those sentences are collected as additional adaptation data. We evaluated the method on a Japanese speech database, using maximum likelihood linear regression (MLLR) as the speaker adaptation algorithm. The method yielded a significant improvement over adaptation with randomly chosen sentences.
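The second-step selection can be illustrated with a minimal sketch. The abstract does not say how closeness between distributions is measured, so the Kullback-Leibler divergence used here is an assumption, as are the additive smoothing and every function and parameter name (`phone_distribution`, `kl_divergence`, `select_sentences`, `inventory`). It should not be read as the authors' implementation.

```python
import math
from collections import Counter


def phone_distribution(phones, inventory, smoothing=1e-6):
    """Relative frequency of each phone in `inventory`.

    Additive smoothing (an assumption, not from the paper) keeps every
    probability strictly positive so the divergence below is finite.
    """
    counts = Counter(phones)
    total = sum(counts[p] for p in inventory) + smoothing * len(inventory)
    return {p: (counts[p] + smoothing) / total for p in inventory}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same keys."""
    return sum(p[k] * math.log(p[k] / q[k]) for k in p)


def select_sentences(error_phones, candidates, inventory, n):
    """Return the ids of the n candidates closest to the error distribution.

    error_phones: phones misrecognized on the initial adaptation data (step 1).
    candidates:   dict mapping a sentence id to its phone sequence.
    Closeness is measured with KL divergence, which is an assumed choice.
    """
    error_dist = phone_distribution(error_phones, inventory)
    scored = []
    for sentence_id, phones in candidates.items():
        sentence_dist = phone_distribution(phones, inventory)
        scored.append((kl_divergence(error_dist, sentence_dist), sentence_id))
    scored.sort()  # smallest divergence first
    return [sentence_id for _, sentence_id in scored[:n]]
```

In use, `error_phones` would come from recognition errors on the initial adaptation data, and utterances of the returned sentences would be recorded as the additional data for MLLR adaptation.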
Bibliographic reference. Shinoda, Koichi / Murakami, Hiroko / Furui, Sadaoki (2009): "Speaker adaptation based on two-step active learning", In INTERSPEECH-2009, 576-579.