8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

An Evaluation of Cross-Language Adaptation and Native Speech Training for Rapid HMM Construction Based on Very Limited Training Data

Xufang Zhao, Douglas O'Shaughnessy

Université du Québec, Canada

As the needs and opportunities for speech technology applications in a variety of languages have grown, methods for rapidly transferring speech technology across languages have become a practical concern. Previous work has focused on comparing adaptation algorithms, for example MAP (Maximum A Posteriori), Bootstrap, and MLLR (Maximum Likelihood Linear Regression), on speaker adaptation. However, an interesting point is that, as the adaptation corpus grows, the performance of direct native speech training may come to exceed that of cross-language adaptation. If so, there should be a threshold size for the adaptation corpus beyond which native training is preferable. In general, transferring acoustic knowledge is useful when not enough training data is available. This paper presents a systematic comparison of the relative effectiveness of cross-language adaptation and native speech training, using transfer from English to Mandarin as a test case. The study found that cross-language adaptation does not produce better acoustic models than direct native speech training, even when training data is limited.

Bibliographic reference. Zhao, Xufang / O'Shaughnessy, Douglas (2007): "An evaluation of cross-language adaptation and native speech training for rapid HMM construction based on very limited training data", in INTERSPEECH-2007, 1433-1436.