INTERSPEECH 2015

I-vectors are a well-known low-dimensional representation of speaker space and are becoming increasingly popular in adaptation of state-of-the-art deep neural network (DNN) acoustic models. One advantage of i-vectors is that they can be used with very little data, for example a single utterance. However, to improve robustness of the i-vector estimates with limited data, a prior is often used. Traditionally, a standard normal prior is applied to i-vectors, which is nevertheless not well suited to the increased variability of short utterances. This paper proposes a more informative prior, derived from the training data. As well as aiming to reduce the non-Gaussian behaviour of the i-vector space, it allows prior information at different levels, for example gender, to be used. Experiments on a US English Broadcast News (BN) transcription task for speaker and utterance i-vector adaptation show that more informative priors reduce the sensitivity to the quantity of data used to estimate the i-vector. The best configuration for this task was utterance-level test i-vectors enhanced with informative priors which gave a 13% relative reduction in word error rate over the baseline (no i-vectors) and a 5% over utterance-level test i-vectors with standard prior.
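The abstract does not give the estimation formula, so the following is only a minimal sketch of how a Gaussian prior enters a point estimate of an i-vector, assuming the usual linear-Gaussian total variability model with diagonal component covariances. The function name `map_ivector` and all argument names are hypothetical, and the prior mean and covariance are assumed to have been estimated beforehand (for example from training-data i-vectors of one gender). Setting the prior to zero mean and identity covariance recovers the standard normal prior.

```python
import numpy as np


def map_ivector(T, sigma_inv, N, F, prior_mean, prior_cov):
    """MAP i-vector estimate under a Gaussian prior (illustrative sketch).

    T          : (C, D, R) per-component total variability matrices
    sigma_inv  : (C, D)    diagonal inverse covariances of the C components
    N          : (C,)      zero-order statistics (soft frame counts)
    F          : (C, D)    first-order statistics, centred on component means
    prior_mean : (R,)      mean of the prior on the i-vector
    prior_cov  : (R, R)    covariance of the prior on the i-vector
    """
    prior_prec = np.linalg.inv(prior_cov)
    # T_c^T Sigma_c^{-1} for every component, shape (C, R, D)
    Tt_sinv = np.transpose(T, (0, 2, 1)) * sigma_inv[:, None, :]
    # Posterior precision: prior precision + sum_c N_c T_c^T Sigma_c^{-1} T_c
    post_prec = prior_prec + np.einsum("c,crd,cdk->rk", N, Tt_sinv, T)
    # Linear term: prior pull + sum_c T_c^T Sigma_c^{-1} F_c
    lin = prior_prec @ prior_mean + np.einsum("crd,cd->r", Tt_sinv, F)
    return np.linalg.solve(post_prec, lin)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C, D, R = 4, 3, 2
    T = rng.normal(size=(C, D, R))
    sigma_inv = np.ones((C, D))
    mu = np.array([0.5, -1.0])          # hypothetical informative prior mean
    cov = np.array([[2.0, 0.3], [0.3, 1.0]])
    # With no data at all, the estimate falls back to the prior mean.
    print(map_ivector(T, sigma_inv, np.zeros(C), np.zeros((C, D)), mu, cov))
```

With few frames the counts in `N` are small, so the prior term dominates and the choice of prior matters most; as the amount of data grows, the data term dominates and the estimate becomes insensitive to the prior.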
Bibliographic reference. Karanasou, Penny / Gales, Mark J. F. / Woodland, Philip C. (2015): "I-vector estimation using informative priors for adaptation of deep neural networks", In INTERSPEECH-2015, 2872-2876.