Uncertainty propagation is an established approach to handling noisy and reverberant conditions in automatic speech recognition (ASR), but it has so far been little studied for speaker recognition. Yu et al. recently proposed propagating uncertainty to the Baum-Welch (BW) statistics without changing the posterior probability of each mixture component. They obtained good results on a small dataset (YOHO) but little improvement on the NIST-SRE dataset, despite the use of oracle uncertainty estimates. In this paper, we propose modifying the computation of the posterior probability of each mixture component in order to obtain unbiased BW statistics. We show that our approach improves the accuracy of BW statistics on the Wall Street Journal (WSJ) corpus, but again yields little or no improvement on NIST-SRE. We provide a theoretical explanation for this result, which opens the way for more efficient exploitation of uncertainty on NIST-SRE and other large datasets in the future.
Bibliographic reference. Ribas, Dayana / Vincent, Emmanuel / Calvo, José Ramón (2015): "Uncertainty propagation for noise robust speaker recognition: the case of NIST-SRE", In INTERSPEECH-2015, 3536-3540.