Latent Factor Analysis of Deep Bottleneck Features for Speaker Verification with Random Digit Strings

Ziqiang Shi, Huibin Lin, Liu Liu, Rujie Liu


Speaker verification with prompted random digit strings is a challenging task because the test utterances are very short. This work investigates how to combine deep bottleneck features (DBF) and latent factor analysis (LFA) into a new state-of-the-art approach for this task. To provide a wider temporal context, stacked DBFs replace the traditional MFCC features in the derivation of the supervector representations, which leads to a significant improvement in speaker verification. LFA is used to model these stacked DBFs at both the digit and utterance scales. Based on the learned LFA model, two kinds of supervector representations are extracted, one for the whole utterance and one for local digits. Since the strengths of DBF and LFA appear complementary, the combination significantly outperforms either of its components. Experiments were conducted on the public RSR2015 Part III corpus; the results show that our approach achieves 1.40% EER for male speakers and 1.55% EER for female speakers.
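The frame stacking mentioned in the abstract can be illustrated with a minimal sketch. It assumes each bottleneck frame is a plain list of floats and that edge frames are padded by repeating the first or last frame. The function name `stack_frames`, the `context` parameter, and the padding rule are illustrative assumptions; the abstract does not state the context width or edge handling actually used.

```python
from typing import List, Sequence


def stack_frames(frames: Sequence[Sequence[float]], context: int = 2) -> List[List[float]]:
    """Concatenate each frame with `context` neighbours on each side.

    Hypothetical helper: frames at the edges are padded by repeating the
    first or last frame (an assumption, not taken from the paper).
    """
    if context < 0:
        raise ValueError("context must be non-negative")
    n = len(frames)
    stacked: List[List[float]] = []
    for t in range(n):
        window: List[float] = []
        for offset in range(-context, context + 1):
            # Clamp the index so that edge frames reuse the nearest valid frame.
            idx = min(max(t + offset, 0), n - 1)
            window.extend(frames[idx])
        stacked.append(window)
    return stacked
```

With `context=c`, each stacked vector has `(2c + 1)` times the dimension of a single frame, which is how stacking widens the temporal context seen by the downstream model.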


DOI: 10.21437/Interspeech.2018-1422

Cite as: Shi, Z., Lin, H., Liu, L., Liu, R. (2018) Latent Factor Analysis of Deep Bottleneck Features for Speaker Verification with Random Digit Strings. Proc. Interspeech 2018, 1081-1085, DOI: 10.21437/Interspeech.2018-1422.


@inproceedings{Shi2018,
  author={Ziqiang Shi and Huibin Lin and Liu Liu and Rujie Liu},
  title={Latent Factor Analysis of Deep Bottleneck Features for Speaker Verification with Random Digit Strings},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={1081--1085},
  doi={10.21437/Interspeech.2018-1422},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1422}
}