ISCA Archive Eurospeech 1999

Robust speaker adaptation of continuous density HMMS using multilayer perceptron network

Mikko Harju, Petri Salmela, Olli Viikki, Mikko Lehtokangas, Jukka Saarinen

This paper compares the performance of global affine and nonlinear transformations for speaker adaptation in a hidden Markov model (HMM) speech recognition system. The nonlinear transformation was obtained with a multilayer perceptron (MLP) network, which was trained during the adaptation process to transform the mean vectors of the HMMs so that the output probabilities of the HMMs for the adaptation utterances were maximized. The MLP adaptation method was compared to the maximum likelihood linear regression (MLLR) adaptation procedure. Both methods were tested in a connected-digit speech recognition system using multi-environment models. The results show that the nonlinear MLP transformation clearly outperforms MLLR in terms of adaptation speed. Moreover, with larger amounts of data, the performance of MLP adaptation was comparable to that of MLLR.
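
The two kinds of global mean transformation described above can be illustrated with a minimal Python sketch. The abstract does not give the network architecture, so the single hidden layer, the tanh activation, the linear output layer, and all function names here are assumptions made for illustration only, not the authors' implementation. The diagonal-covariance Gaussian log-likelihood stands in for the HMM output probability that the adaptation would seek to maximize; the training loop itself is omitted.

```python
import math

def affine_transform(mean, A, b):
    """MLLR-style global affine map of one mean vector: A * mean + b."""
    return [sum(a * m for a, m in zip(row, mean)) + b_i
            for row, b_i in zip(A, b)]

def mlp_transform(mean, W1, b1, W2, b2):
    """Nonlinear map of one mean vector (assumed: one tanh hidden layer,
    linear output layer)."""
    hidden = [math.tanh(sum(w * m for w, m in zip(row, mean)) + c)
              for row, c in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + c
            for row, c in zip(W2, b2)]

def diag_gauss_loglik(x, mean, var):
    """Log-density of frame x under a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - mi) ** 2 / v)
               for xi, mi, v in zip(x, mean, var))

# Hypothetical 2-dimensional mean vector and one adaptation frame.
mean = [1.0, -2.0]
frame = [1.5, -1.5]
var = [1.0, 1.0]

# An affine map that shifts the mean onto the frame raises the likelihood.
shifted = affine_transform(mean, [[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
before = diag_gauss_loglik(frame, mean, var)
after = diag_gauss_loglik(frame, shifted, var)
```

In an adaptation procedure of this kind, the parameters of either map would be chosen to increase the total log-likelihood of the adaptation frames; the affine map has a matrix and a bias, while the MLP adds a nonlinearity between two such layers.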


doi: 10.21437/Eurospeech.1999-546

Cite as: Harju, M., Salmela, P., Viikki, O., Lehtokangas, M., Saarinen, J. (1999) Robust speaker adaptation of continuous density HMMS using multilayer perceptron network. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 2499-2502, doi: 10.21437/Eurospeech.1999-546

@inproceedings{harju99_eurospeech,
  author={Mikko Harju and Petri Salmela and Olli Viikki and Mikko Lehtokangas and Jukka Saarinen},
  title={{Robust speaker adaptation of continuous density HMMS using multilayer perceptron network}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={2499--2502},
  doi={10.21437/Eurospeech.1999-546}
}