LOCUST - Longitudinal Corpus and Toolset for Speaker Verification

Evgeny Dmitriev, Yulia Kim, Anastasia Matveeva, Claude Montacié, Yannick Boulard, Yadviga Sinyavskaya, Yulia Zhukova, Adam Zarazinski, Egor Akhanov, Ilya Viksnin, Andrei Shlykov, Maria Usova

In this paper, we present a new longitudinal corpus and a toolset that address the influence of voice aging on speaker verification. We examined previous longitudinal research on age-related voice changes as well as its applicability to real-world use cases. Our findings reveal that researchers have treated age-related voice changes as a hindrance instead of leveraging them to the advantage of the identity validator. Additionally, we found a significant dearth of publicly available corpora, in terms of both the time span of the audio recordings and the number of participants. We also identified a significant bias toward the development of speaker recognition technologies applicable to government surveillance systems, as opposed to speaker verification systems used in civilian IT security. To address these issues, we built an open project with the largest publicly available longitudinal speaker database, which includes 229 speakers with an average talking time exceeding 15 hours, spanning an average of 21 years per speaker. We assembled, cleaned, and normalized the audio recordings and developed software tools for speech feature extraction, all of which we are releasing to the public domain.

 DOI: 10.21437/Interspeech.2018-2412

Cite as: Dmitriev, E., Kim, Y., Matveeva, A., Montacié, C., Boulard, Y., Sinyavskaya, Y., Zhukova, Y., Zarazinski, A., Akhanov, E., Viksnin, I., Shlykov, A., Usova, M. (2018) LOCUST - Longitudinal Corpus and Toolset for Speaker Verification. Proc. Interspeech 2018, 1096-1100, DOI: 10.21437/Interspeech.2018-2412.

  author={Evgeny Dmitriev and Yulia Kim and Anastasia Matveeva and Claude Montacié and Yannick Boulard and Yadviga Sinyavskaya and Yulia Zhukova and Adam Zarazinski and Egor Akhanov and Ilya Viksnin and Andrei Shlykov and Maria Usova},
  title={LOCUST - Longitudinal Corpus and Toolset for Speaker Verification},
  booktitle={Proc. Interspeech 2018},