EUROSPEECH '91
Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991

                 

Speaker Adaptation based on Articulatory Features

O. Schmidbauer, H. Höge

Siemens AG, ZFE IS KOM31, München, Germany

We present an approach to rapid speaker adaptation that both reduces inter-speaker variability at the acoustic level and permits dynamic adaptation of the system's reference model. In contrast to other methods, we use a two-level approach: (1) we dynamically normalize speech parameters (formants) to speaker-specific means and variances, and (2) we use an articulatory-based representation situated between the acoustic and phonemic levels. Performance was evaluated on a vocabulary-independent continuous speech task with perplexity 120. We achieved a 9.2% word error rate using only 10 short sentences for adaptation to a new speaker; the error rates for the speaker-dependent and cross-speaker modes are 8.5% and 24.7%, respectively. The results show that the articulatory representation is relatively speaker-invariant and can be "tuned" to a new speaker with only a small number of training samples.

Keywords: two-step speaker-adaptation method, normalized formant features, articulatory-feature vector (AFV), Hidden Markov Models.
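
Step (1) of the abstract, normalizing formants to speaker-specific means and variances as speech arrives, can be illustrated with a running z-score. The sketch below is only an assumption about what such a normalization might look like: the abstract does not specify the update rule, so Welford's online mean/variance algorithm is swapped in here, and the names `RunningNormalizer` and `normalize_track` are hypothetical.

```python
import math


class RunningNormalizer:
    """Online z-score normalization for one formant track (hypothetical helper)."""

    def __init__(self):
        self.n = 0        # number of frames seen so far
        self.mean = 0.0   # running speaker-specific mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        # Welford's online update (an assumed choice, not taken from the paper).
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Population standard deviation; 0.0 until two frames have been seen.
        if self.n < 2:
            return 0.0
        return math.sqrt(self.m2 / self.n)

    def normalize(self, x):
        # Map a raw value to zero mean and unit variance for this speaker.
        s = self.std()
        return 0.0 if s == 0.0 else (x - self.mean) / s


def normalize_track(values):
    """Normalize each frame using only the statistics accumulated so far."""
    norm = RunningNormalizer()
    out = []
    for v in values:
        norm.update(v)
        out.append(norm.normalize(v))
    return out
```

For example, `normalize_track([500.0, 700.0])` treats the two values as hypothetical first-formant frequencies in Hz and normalizes each against the statistics gathered up to that frame, so the estimates sharpen as more of the new speaker's speech is observed.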


Bibliographic reference.  Schmidbauer, O. / Höge, H. (1991): "Speaker adaptation based on articulatory features", In EUROSPEECH-1991, 1099-1102.