16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

An Information Theory Based Data-Homogeneity Measure for Voice Comparison

Moez Ajili (1), Jean-François Bonastre (1), Solange Rossato (2), Juliette Kahn (3), Itshak Lapidot (4)

(1) LIA, France
(2) LIG (UMR 5217), France
(3) LNE, France
(4) Afeka, Israel

In forensic voice comparison, it is strongly recommended to follow the Bayesian paradigm when presenting forensic evidence to the court. In this paradigm, the strength of the forensic evidence is summarized by a likelihood ratio (LR). Theoretically, an LR intrinsically embeds reliability information: in good conditions the LR can take extreme values, on the order of 10^±10, while in bad conditions the LR should be very close to one. In the real world, however, forensic processes provide only an empirical estimate of the LR, which is sometimes far from the theoretical value and unable to embed reliability information. This is particularly true for speaker recognition systems, which output a score in all situations regardless of the case-specific conditions and apply normalization steps so that this score can be interpreted as an LR. Consequently, reliability has to be taken into account separately from the LR so that the forensic expert can make an appropriate judgement. Reliability depends first on the two signals that make up a voice comparison trial: both the presence of speaker-specific information and the homogeneity of this information between the two signals should be evaluated. This paper is dedicated to the latter, homogeneity. We propose an information theory (IT) based homogeneity measure that determines whether a voice comparison is feasible, regardless of the system used.

Bibliographic reference. Ajili, Moez / Bonastre, Jean-François / Rossato, Solange / Kahn, Juliette / Lapidot, Itshak (2015): "An information theory based data-homogeneity measure for voice comparison", in INTERSPEECH-2015, 3451-3455.