ISCA Archive Interspeech 2008

Addressing database mismatch in forensic speaker recognition with Ahumada III: a public real-casework database in Spanish

Daniel Ramos, Joaquin Gonzalez-Rodriguez, Javier Gonzalez-Dominguez, Jose Juan Lucena-Molina

This paper presents and describes Ahumada III, a speech database in Spanish collected from real forensic cases. In its current release, the database contains 61 male speakers recorded using the systems and procedures followed by the Spanish Guardia Civil police force. The paper also explores the usefulness of such a corpus for addressing the important problem of database mismatch in speaker recognition, understood as the difference between the database used for tuning a speaker recognition system and the data that the system will handle in operational conditions. This problem is typical in forensics, where variability in speech conditions may be extreme and difficult to model. Therefore, this work also presents a study evaluating the impact of this problem, for which a corpus referred to as NIST4M (NIST MultiMic MisMatch) has been constructed from NIST SRE 2006 data. NIST4M contains microphone data both in the enrolled models and in the test segments, allowing the generation of trials in a variety of strongly mismatched conditions. Database mismatch is simulated by eliminating some microphone channels of interest from the background data and then computing scores with speech from those microphones, so that the testing conditions are unknown to the system, as usually happens in forensic speaker recognition. Finally, we show how incorporating Ahumada III as background data helps to address database mismatch in real-world forensic conditions.
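
The mismatch simulation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the channel labels (`mic_A`, `mic_B`, `mic_C`) and the function name `split_for_mismatch` are all hypothetical, chosen only to show the idea of removing the channels of interest from the background data and keeping them for testing.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical record layout: each recording carries a speaker id and a
# microphone channel label. Neither field name comes from the paper.
Recording = Dict[str, str]


def split_for_mismatch(
    recordings: List[Recording], held_out_channels: Set[str]
) -> Tuple[List[Recording], List[Recording]]:
    """Split recordings into background data and test data.

    Background data excludes every held-out channel, so the system is
    tuned without seeing those microphones. Test data contains only the
    held-out channels, which play the role of unknown conditions.
    """
    background = [r for r in recordings if r["channel"] not in held_out_channels]
    test = [r for r in recordings if r["channel"] in held_out_channels]
    return background, test


# Toy data, invented for illustration only.
recordings = [
    {"speaker": "spk1", "channel": "mic_A"},
    {"speaker": "spk1", "channel": "mic_C"},
    {"speaker": "spk2", "channel": "mic_B"},
    {"speaker": "spk2", "channel": "mic_C"},
]

background, test = split_for_mismatch(recordings, {"mic_C"})
print(len(background), len(test))
```

Under these assumptions, a mismatch-aware background set would then be obtained by adding recordings from a separate corpus to `background`, which is the role the abstract assigns to Ahumada III.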


doi: 10.21437/Interspeech.2008-429

Cite as: Ramos, D., Gonzalez-Rodriguez, J., Gonzalez-Dominguez, J., Lucena-Molina, J.J. (2008) Addressing database mismatch in forensic speaker recognition with Ahumada III: a public real-casework database in Spanish. Proc. Interspeech 2008, 1493-1496, doi: 10.21437/Interspeech.2008-429

@inproceedings{ramos08_interspeech,
  author={Daniel Ramos and Joaquin Gonzalez-Rodriguez and Javier Gonzalez-Dominguez and Jose Juan Lucena-Molina},
  title={{Addressing database mismatch in forensic speaker recognition with Ahumada III: a public real-casework database in Spanish}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={1493--1496},
  doi={10.21437/Interspeech.2008-429}
}