ISCA Archive Interspeech 2008

Beyond frame independence: parametric modelling of time duration in speaker and language recognition

Alan McCree, Fred Richardson, Elliot Singer, Douglas A. Reynolds

In this work, we address the question of generating accurate likelihood estimates from multi-frame observations in speaker and language recognition. Using a simple theoretical model, we extend the basic assumption of independent frames to include two refinements: a local correlation model across neighboring frames, and a global uncertainty due to train/test channel mismatch. We present an algorithm for discriminative training of the resulting duration model based on logistic regression combined with a bisection search. We show that using this model we can achieve state-of-the-art performance for the NIST LRE07 task. Finally, we show that these more accurate class likelihood estimates can be combined to solve multiple problems using Bayes' rule, so that we can expand our single parametric back-end to replace all six separate back-ends used in our NIST LRE submission for both closed and open sets.
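
The final point of the abstract, reusing one set of class likelihoods for several decision problems through Bayes' rule, can be illustrated with a minimal sketch. It is not the authors' back-end: the function names `posteriors` and `detection_llr`, the dictionary inputs, and the example priors are all assumptions made here for illustration, and the duration model that would produce the log-likelihoods is not shown.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-domain values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def posteriors(log_likes, priors):
    """Closed-set identification: class posteriors via Bayes' rule.

    log_likes: dict mapping class -> log p(X | class) (hypothetical input)
    priors:    dict mapping class -> P(class), summing to 1
    """
    joint = {c: log_likes[c] + math.log(priors[c]) for c in log_likes}
    norm = log_sum_exp(list(joint.values()))
    return {c: math.exp(v - norm) for c, v in joint.items()}


def detection_llr(log_likes, priors, target):
    """Detection: log-likelihood ratio of the target class against the
    prior-weighted mixture of all other classes."""
    others = [c for c in log_likes if c != target]
    prior_mass = sum(priors[c] for c in others)
    non_target = log_sum_exp(
        [log_likes[c] + math.log(priors[c] / prior_mass) for c in others]
    )
    return log_likes[target] - non_target


if __name__ == "__main__":
    # Illustrative numbers only, not results from the paper.
    ll = {"english": -10.0, "spanish": -12.0, "mandarin": -12.0}
    pr = {c: 1.0 / 3.0 for c in ll}
    print(posteriors(ll, pr))
    print(detection_llr(ll, pr, "english"))
```

The same likelihoods feed both functions, which is the sense in which a single set of accurate class likelihood estimates can serve more than one task. An open-set variant would need an additional out-of-set class, which this sketch leaves out.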


doi: 10.21437/Interspeech.2008-237

Cite as: McCree, A., Richardson, F., Singer, E., Reynolds, D.A. (2008) Beyond frame independence: parametric modelling of time duration in speaker and language recognition. Proc. Interspeech 2008, 767-770, doi: 10.21437/Interspeech.2008-237

@inproceedings{mccree08_interspeech,
  author={Alan McCree and Fred Richardson and Elliot Singer and Douglas A. Reynolds},
  title={{Beyond frame independence: parametric modelling of time duration in speaker and language recognition}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={767--770},
  doi={10.21437/Interspeech.2008-237}
}