8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Using Direction of Arrival Estimate and Acoustic Feature Information in Speaker Diarization

Eugene Chin Wei Koh (1), Hanwu Sun (2), Tin Lay Nwe (2), Trung Hieu Nguyen (1), Bin Ma (2), Eng Siong Chng (1), Haizhou Li (2), Susanto Rahardja (2)

(1) Nanyang Technological University, Singapore
(2) Institute for Infocomm Research, Singapore

This paper describes the I2R/NTU system submitted to the Multiple Distant Microphone (MDM) task of the NIST Rich Transcription 2007 (RT-07) Meeting Recognition evaluation. In our implementation, Direction of Arrival (DOA) information is used to perform speaker turn detection and clustering. The clusters are then purified by GMM modeling of acoustic features. Finally, non-speech and silence segments are removed. The system achieved an overall diarization error rate (DER) of 31.02% on the NIST Rich Transcription Spring 2006 evaluation tasks.
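
The abstract does not give algorithmic details, so the following is only an illustrative sketch of how per-frame DOA estimates could drive turn detection and clustering. It is not the authors' implementation. The function names (`detect_turns`, `cluster_segments`), the 20-degree threshold, and the greedy clustering rule are all assumptions made for this example, and angle wraparound at 0/360 degrees is ignored for simplicity.

```python
from statistics import median


def detect_turns(doa_deg, threshold_deg=20.0):
    """Split a per-frame DOA track into (start, end) segments.

    A new segment starts wherever consecutive frames differ by more
    than threshold_deg (an assumed value, not taken from the paper).
    """
    segments = []
    start = 0
    for i in range(1, len(doa_deg)):
        if abs(doa_deg[i] - doa_deg[i - 1]) > threshold_deg:
            segments.append((start, i))
            start = i
    if doa_deg:
        segments.append((start, len(doa_deg)))
    return segments


def cluster_segments(doa_deg, segments, threshold_deg=20.0):
    """Greedily label segments by their median DOA.

    Each segment joins the first existing cluster whose centre lies
    within threshold_deg of its median angle; otherwise it opens a
    new cluster. Cluster centres are fixed at the first member's angle.
    """
    centres = []
    labels = []
    for start, end in segments:
        angle = median(doa_deg[start:end])
        for k, centre in enumerate(centres):
            if abs(angle - centre) <= threshold_deg:
                labels.append(k)
                break
        else:
            centres.append(angle)
            labels.append(len(centres) - 1)
    return labels


# Hypothetical DOA track in degrees: two talkers at roughly 30 and 120.
doa = [30, 31, 29, 120, 121, 119, 30, 32]
segments = detect_turns(doa)
labels = cluster_segments(doa, segments)
print(segments)  # [(0, 3), (3, 6), (6, 8)]
print(labels)    # [0, 1, 0]
```

In a pipeline like the one described, segment labels of this kind would only be a starting point: the GMM-based purification and non-speech removal stages operate on acoustic features and are not modelled here.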

Bibliographic reference.  Koh, Eugene Chin Wei / Sun, Hanwu / Nwe, Tin Lay / Nguyen, Trung Hieu / Ma, Bin / Chng, Eng Siong / Li, Haizhou / Rahardja, Susanto (2007): "Using direction of arrival estimate and acoustic feature information in speaker diarization", In INTERSPEECH-2007, 2149-2152.