10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Forensic Speaker Recognition Using Traditional Features Comparing Automatic and Human-in-the-Loop Formant Tracking

Alberto de Castro, Daniel Ramos, Joaquin Gonzalez-Rodriguez

Universidad Autónoma de Madrid, Spain

In this paper we compare forensic speaker recognition with traditional features under two formant tracking strategies: one fully automatic and one semi-automatic, performed by human experts. The main contribution of the work is the use of an automatic formant tracking method, which makes the recognition process much faster and allows a much larger amount of data to be used for modelling the background population, calibration, etc. This is especially important in likelihood-ratio-based forensic speaker recognition, where the variation of features across a population of speakers must be modelled in a statistically robust way. Experiments show that, although recognition with the human-in-the-loop approach outperforms the automatic scheme, the performance of the latter is also acceptable. Moreover, we present a novel feature selection method that identifies which features of each formant contribute most to the discriminating power of the whole recognition process; the expert can use this analysis to decide which features in the available speech material are important.
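As an illustration of why the background population model matters in a likelihood-ratio framework, the following minimal sketch computes a log-likelihood ratio for a single formant-derived feature. It is not the method of the paper: the univariate Gaussian models, the function names and all numeric values (means and standard deviations in Hz) are assumptions chosen only to show the general shape of the computation.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate normal distribution at x."""
    var = std * std
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def log_likelihood_ratio(x, suspect_mean, suspect_std, pop_mean, pop_std):
    """log LR = log p(x | suspect model) - log p(x | background population model).

    Both models are assumed Gaussian here purely for illustration.
    """
    same_speaker = gaussian_logpdf(x, suspect_mean, suspect_std)
    different_speaker = gaussian_logpdf(x, pop_mean, pop_std)
    return same_speaker - different_speaker


# Hypothetical values: a suspect model centred at 1500 Hz and a broader
# background population model centred at 1400 Hz.
llr_close = log_likelihood_ratio(1500.0, 1500.0, 50.0, 1400.0, 150.0)
llr_far = log_likelihood_ratio(1250.0, 1500.0, 50.0, 1400.0, 150.0)

print(f"log LR, questioned value near the suspect mean: {llr_close:.3f}")
print(f"log LR, questioned value far from the suspect mean: {llr_far:.3f}")
```

A positive log LR supports the same-speaker hypothesis and a negative one supports the different-speaker hypothesis. Because the population model forms the denominator, a poorly estimated background model distorts every ratio, which is why a larger amount of background data is valuable.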

Bibliographic reference. Castro, Alberto de / Ramos, Daniel / Gonzalez-Rodriguez, Joaquin (2009): "Forensic speaker recognition using traditional features comparing automatic and human-in-the-loop formant tracking", in INTERSPEECH-2009, 2343-2346.