ISCA Archive Odyssey 2004

Handling mismatch in corpus-based forensic speaker recognition

Anil Alexander, Filippo Botti, Andrzej Drygajlo

This paper deals with automatic speaker recognition in forensic applications and with handling mismatched technical conditions in a Bayesian framework for evaluating the strength of evidence. Mismatch in recording conditions has to be considered when estimating the strength of evidence, i.e., how likely it is that a questioned recording (trace) was produced by a suspected speaker rather than by any other person from a relevant population. In forensic speaker recognition, a Bayesian interpretation framework and a corpus-based methodology are employed to estimate this likelihood ratio.
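As an illustration of the likelihood ratio described above, the following is a minimal sketch and not the authors' implementation. It assumes that the evidence is a single similarity score, that scores from same-speaker comparisons (within-source) and from comparisons against the relevant population (between-source) are available, and that each set can be modelled by a Gaussian. The function names, the Gaussian assumption and the example scores are all hypothetical.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def likelihood_ratio(evidence_score, within_scores, between_scores):
    """Ratio p(E | same speaker) / p(E | other speaker from the population).

    Assumption: both score sets are modelled as Gaussians fitted with the
    sample mean and sample standard deviation. Other density estimates
    could be substituted without changing the structure of the ratio.
    """
    mu_w, sd_w = mean(within_scores), stdev(within_scores)
    mu_b, sd_b = mean(between_scores), stdev(between_scores)
    numerator = gaussian_pdf(evidence_score, mu_w, sd_w)
    denominator = gaussian_pdf(evidence_score, mu_b, sd_b)
    return numerator / denominator


# Hypothetical scores, invented purely for illustration.
within = [9.1, 10.4, 11.0, 9.8, 10.2]   # suspect compared with own recordings
between = [1.2, 2.5, 0.4, 3.1, 1.8]     # trace compared with population speakers

lr = likelihood_ratio(9.5, within, between)
print(lr > 1.0)  # a ratio above 1 supports the same-speaker hypothesis
```

A ratio greater than 1 supports the hypothesis that the suspected speaker produced the trace, and a ratio below 1 supports the alternative. Mismatched recording conditions shift the score distributions, which is why the ratio cannot be trusted without checking for mismatch first.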

Although automatic speaker recognition has shown high performance under controlled conditions, the conditions in which recordings are made by the police (anonymous calls and wiretapping) cannot be controlled and are far from ideal. Differences in the phone handset, in the transmission channel and in the recording tools introduce variability over and above the variability of human speech. In this paper we focus on how to estimate and deal with differences in the recording conditions of the databases used: detecting whether there is good discrimination between speakers within a database, detecting significant mismatch in recording conditions, and applying statistical compensation when mismatch is present.

Cite as: Alexander, A., Botti, F., Drygajlo, A. (2004) Handling mismatch in corpus-based forensic speaker recognition. Proc. The Speaker and Language Recognition Workshop (Odyssey 2004), 69-74

  author={Anil Alexander and Filippo Botti and Andrzej Drygajlo},
  title={{Handling mismatch in corpus-based forensic speaker recognition}},
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2004)},