11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

The IIR NIST SRE 2008 and 2010 Summed Channel Speaker Recognition Systems

Hanwu Sun, Bin Ma, Chien-Lin Huang, Trung Hieu Nguyen, Haizhou Li

Department of Human Language Technology, Institute for Infocomm Research, A*STAR, Singapore

This paper describes the IIR speaker recognition system for the summed channel evaluation tasks in the 2008 and 2010 NIST SREs. The system comprises three main modules: voice activity detection, speaker diarization, and speaker recognition. The front end employs a spectral-subtraction-based voice activity detection algorithm for effective speech frame selection. The speaker diarization system applied in the 2007 and 2009 NIST RTs is adopted for summed channel speech segmentation. A hybrid purifying and clustering algorithm is used to cluster the summed channel speech into two speaker clusters. A GMM-SVM speaker recognition system is adopted to evaluate performance with both MFCC and LPCC features. The system achieves competitive overall EERs of 3.46% in the 1conv-summed task and 1.87% in the 8conv-summed task when only the all-English trials are included.
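
The EER (equal error rate) quoted in the abstract is the operating point at which the miss rate on target trials equals the false alarm rate on non-target trials. The following is a minimal sketch of how such a figure can be computed from trial scores. It is not the scoring procedure used by the authors or by NIST; the function name equal_error_rate and the toy scores are assumptions made for illustration only.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Return the EER as a fraction in [0, 1].

    Hypothetical helper for illustration: a trial is accepted when its
    score is greater than or equal to the decision threshold.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)

    # Candidate thresholds: every distinct score observed in either set.
    thresholds = np.unique(np.concatenate([target, nontarget]))

    # Miss rate: fraction of target trials rejected at each threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    # False alarm rate: fraction of non-target trials accepted.
    false_alarm = np.array([(nontarget >= t).mean() for t in thresholds])

    # Pick the threshold where the two error rates are closest and
    # report their average as the EER estimate.
    idx = np.argmin(np.abs(miss - false_alarm))
    return (miss[idx] + false_alarm[idx]) / 2.0


if __name__ == "__main__":
    # Toy scores, not taken from the paper.
    targets = [2.0, 1.5, 0.2, 1.8]
    nontargets = [0.1, -0.5, 0.3, -1.0]
    print(f"EER = {100 * equal_error_rate(targets, nontargets):.2f}%")
```

With a finite set of trials the two error curves rarely cross exactly, so this sketch averages them at the closest threshold; other tools may interpolate along the DET curve instead and give slightly different values.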

Bibliographic reference.  Sun, Hanwu / Ma, Bin / Huang, Chien-Lin / Nguyen, Trung Hieu / Li, Haizhou (2010): "The IIR NIST SRE 2008 and 2010 summed channel speaker recognition systems", In INTERSPEECH-2010, 366-369.