Combining four text independent speaker recognition methods

P. Thévenaz, H. Hügli

This paper addresses automatic text-independent speaker recognition over telephone-bandwidth speech. We first review what text independence means, then present our approach to the problem.

Our aim is to assemble enough distinct methods that combining them is worthwhile. We therefore present four methods of text-independent speaker verification and analyze the algorithm and performance of each individually before combining them. The methods are essentially statistical in nature; they use cepstral vectors obtained by LPC analysis.

The first method simply characterizes the speaker by the mean cepstrum. The second accumulates the vector quantization error obtained when an utterance is quantized with the speaker's codebook. The third is derived from the second by using differential cepstral vectors instead. The fourth and last method uses the histogram of codebook entries obtained by vector quantizing the utterance with a universal cepstrum codebook.
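As a rough illustration of the four distances just described, the following minimal sketch may help. It is not the authors' implementation. It assumes that cepstral frames are already available as a NumPy array of shape (frames, coefficients), that codebooks have been trained beforehand, that the Euclidean distance is used throughout, that differential cepstra are plain first differences between consecutive frames, and that histograms are compared by an L1 distance. All function names are hypothetical.

```python
import numpy as np

def _sq_dists(frames, codebook):
    # Squared Euclidean distance from every frame to every codeword.
    diff = frames[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=2)

def mean_cepstrum_distance(frames, ref_mean):
    # Method 1: distance between the utterance's mean cepstrum and
    # the reference speaker's mean cepstrum (Euclidean, by assumption).
    return float(np.linalg.norm(frames.mean(axis=0) - ref_mean))

def vq_distortion(frames, speaker_codebook):
    # Method 2: quantization error accumulated over the utterance,
    # normalized here by the number of frames (an assumption).
    return float(_sq_dists(frames, speaker_codebook).min(axis=1).mean())

def delta_cepstra(frames):
    # Method 3 input: differential cepstra, taken here as first
    # differences (an assumption); feed the result to vq_distortion
    # with a codebook trained on differential vectors.
    return np.diff(frames, axis=0)

def codebook_histogram(frames, universal_codebook):
    # Method 4: relative frequency with which each entry of a
    # universal codebook is the nearest codeword.
    idx = _sq_dists(frames, universal_codebook).argmin(axis=1)
    counts = np.bincount(idx, minlength=len(universal_codebook))
    return counts / counts.sum()

def histogram_distance(h_test, h_ref):
    # L1 distance between two histograms (an assumed choice).
    return float(np.abs(h_test - h_ref).sum())
```

Each function returns one scalar distance per test utterance and claimed speaker, so a verification trial yields a vector of four distances.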

The distances produced by these four methods are combined by Fisher linear discriminant analysis, which substantially improves performance over any single method. The performance achieved is compared with results reported elsewhere in the literature.
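The combination step can be sketched as a two-class Fisher discriminant over vectors of four distances. This is a minimal sketch under stated assumptions, not the authors' procedure: it assumes labelled training trials are available for true speakers ("client") and impostors, and the function and variable names are hypothetical.

```python
import numpy as np

def fisher_direction(client, impostor):
    # client, impostor: arrays of shape (trials, methods) holding the
    # per-method distances for true-speaker and impostor trials.
    m_c = client.mean(axis=0)
    m_i = impostor.mean(axis=0)
    # Within-class scatter matrix, summed over both classes.
    s_w = (client - m_c).T @ (client - m_c) \
        + (impostor - m_i).T @ (impostor - m_i)
    # Fisher direction: maximizes between-class separation relative
    # to within-class scatter. Solving avoids an explicit inverse.
    return np.linalg.solve(s_w, m_i - m_c)

def combined_score(distances, w):
    # Project the per-method distances onto the Fisher direction;
    # larger scores point towards the impostor class.
    return distances @ w
```

With this sign convention, a trial would be accepted when its combined score falls below a threshold chosen on training data, for instance midway between the two projected class means.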

