ISCA Archive Interspeech 2013

Handling recordings acquired simultaneously over multiple channels with PLDA

Jesús Villalba, Mireia Diez, Amparo Varona, Eduardo Lleida

In some speaker recognition scenarios, conversations are recorded simultaneously over multiple channels. This is the case for the interviews in the NIST SRE dataset. To take advantage of this, we propose a modification of the PLDA model that includes two different inter-session variability terms. The first term is tied across all the recordings belonging to the same conversation, whereas the second is not. Thus, the former is mainly intended to capture the variability due to the phonetic content of the conversation, while the latter is intended to capture the channel variability. We test this approach on the NIST SRE12 core condition, using multiple channels per interview to enroll the speakers. Compared to the standard PLDA (scored by the book), the proposed approach improves the minimum DCF by 26.29% on telephone speech and by 1.8% on interviews.
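The tying described in the abstract can be illustrated with a minimal generative sketch. It is not the authors' implementation: it assumes a simplified additive form in which each recording vector is a mean plus a speaker term, a conversation term shared by all channels of that conversation, and a per-recording channel term. The function name `sample_speaker`, the isotropic Gaussian priors, and all default standard deviations are hypothetical choices made only for illustration.

```python
import numpy as np


def sample_speaker(rng, n_conversations, n_channels, dim=4,
                   speaker_std=1.0, conversation_std=0.5, channel_std=0.3):
    """Draw recording vectors for one speaker (illustrative assumption only).

    Returns the vectors with shape (n_conversations, n_channels, dim),
    together with the latent terms that generated them.
    """
    mu = np.zeros(dim)  # global mean (assumed zero here)
    # Speaker term: shared by every recording of this speaker.
    y = rng.normal(0.0, speaker_std, size=dim)
    # Conversation term: one draw per conversation, tied across its channels.
    x = rng.normal(0.0, conversation_std, size=(n_conversations, 1, dim))
    # Channel term: an independent draw for every single recording.
    e = rng.normal(0.0, channel_std, size=(n_conversations, n_channels, dim))
    # Broadcasting repeats y over all recordings and x over the channel axis.
    vectors = mu + y + x + e
    return vectors, y, x, e


rng = np.random.default_rng(0)
vectors, y, x, e = sample_speaker(rng, n_conversations=3, n_channels=2)

# Two channels of the same conversation differ only by their channel terms,
# because the speaker and conversation terms cancel.
same_conversation_diff = vectors[0, 0] - vectors[0, 1]
print(np.allclose(same_conversation_diff, e[0, 0] - e[0, 1]))
```

Under these assumptions, recordings from different conversations of the same speaker still differ in both the conversation and channel terms, which is the distinction the tied term is meant to capture.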


doi: 10.21437/Interspeech.2013-420

Cite as: Villalba, J., Diez, M., Varona, A., Lleida, E. (2013) Handling recordings acquired simultaneously over multiple channels with PLDA. Proc. Interspeech 2013, 2509-2513, doi: 10.21437/Interspeech.2013-420

@inproceedings{villalba13_interspeech,
  author={Jesús Villalba and Mireia Diez and Amparo Varona and Eduardo Lleida},
  title={{Handling recordings acquired simultaneously over multiple channels with PLDA}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={2509--2513},
  doi={10.21437/Interspeech.2013-420}
}