A Double Joint Bayesian Approach for J-Vector Based Text-dependent Speaker Verification

Ziqiang Shi, Mengjiao Wang, Liu Liu, Huibin Lin, Rujie Liu


The j-vector has proven very effective for text-dependent speaker verification with short-duration speech. However, current state-of-the-art back-end classifiers, such as the joint Bayesian model, cannot make full use of such deep features. In this paper, we generalize the standard joint Bayesian approach to model the multi-faceted information in the j-vector explicitly and jointly. In our generalization, the j-vector is modeled as the output of a generative Double Joint Bayesian (DoJoBa) model that contains several kinds of latent variables. With DoJoBa, we can explicitly build a model that combines multiple kinds of heterogeneous information from the j-vectors. In the verification step, we compute a likelihood that describes whether two j-vectors have consistent labels. On the public RSR2015 corpus, experimental results show that our approach achieves an EER of 0.02% in both the impostor-wrong and impostor-correct cases.
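As a rough illustration of the verification step, the sketch below scores a pair of feature vectors with the standard joint Bayesian model that the paper generalizes, not with DoJoBa itself. It assumes each vector is a zero-mean sum of a label-dependent latent variable and within-class noise, and it further assumes diagonal covariances so that no matrix library is needed. The function name `joint_bayesian_llr`, the parameter names, and the toy values are hypothetical and do not come from the paper.

```python
import math


def joint_bayesian_llr(x1, x2, var_between, var_within):
    """Log-likelihood ratio log p(x1, x2 | same) - log p(x1, x2 | different).

    Assumes zero-mean features and DIAGONAL covariances (a simplification),
    so every dimension contributes an independent 2x2 Gaussian term.
    """
    llr = 0.0
    for a, b, s_mu, s_eps in zip(x1, x2, var_between, var_within):
        t = s_mu + s_eps          # total variance of one feature
        c = s_mu                  # cross-covariance when the label is shared
        det_same = t * t - c * c  # determinant of [[t, c], [c, t]]
        det_diff = t * t          # determinant of [[t, 0], [0, t]]
        quad_same = (t * a * a - 2.0 * c * a * b + t * b * b) / det_same
        quad_diff = (a * a + b * b) / t
        # The 2*log(2*pi) normalizer is common to both hypotheses and cancels.
        llr += -0.5 * (math.log(det_same) - math.log(det_diff)
                       + quad_same - quad_diff)
    return llr


# Hypothetical toy values, not taken from the paper.
var_between = [4.0, 4.0]
var_within = [1.0, 1.0]
same_score = joint_bayesian_llr([2.0, -2.0], [2.0, -2.0], var_between, var_within)
diff_score = joint_bayesian_llr([2.0, -2.0], [-2.0, 2.0], var_between, var_within)
print(same_score > 0.0 > diff_score)
```

A positive score favors the hypothesis that both vectors share a label, and a negative score favors different labels. A model with separate latent variables for speaker and phrase, as the abstract describes, would need additional hypotheses beyond this two-way comparison.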


DOI: 10.21437/Odyssey.2018-51

Cite as: Shi, Z., Wang, M., Liu, L., Lin, H., Liu, R. (2018) A Double Joint Bayesian Approach for J-Vector Based Text-dependent Speaker Verification. Proc. Odyssey 2018 The Speaker and Language Recognition Workshop, 365-371, DOI: 10.21437/Odyssey.2018-51.


@inproceedings{Shi2018,
  author={Ziqiang Shi and Mengjiao Wang and Liu Liu and Huibin Lin and Rujie Liu},
  title={A Double Joint Bayesian Approach for J-Vector Based Text-dependent Speaker Verification},
  year=2018,
  booktitle={Proc. Odyssey 2018 The Speaker and Language Recognition Workshop},
  pages={365--371},
  doi={10.21437/Odyssey.2018-51},
  url={http://dx.doi.org/10.21437/Odyssey.2018-51}
}