11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Overlap Detection for Speaker Diarization by Fusing Spectral and Spatial Features

Martin Zelenák, Carlos Segura, Javier Hernando

Universitat Politècnica de Catalunya, Spain

A substantial portion of the errors made by conventional speaker diarization systems on meeting data can be attributed to overlapped speech. This paper proposes the use of several spatial features to improve speech overlap detection on distant channel microphones. These spatial features are integrated into a spectral-based system using principal component analysis and neural networks. Different overlap detection hypotheses are used to improve diarization performance with both overlap exclusion and overlap labeling. In experiments conducted on the AMI Meeting Corpus, we demonstrate relative diarization error rate (DER) improvements of 11.6% and 14.6% for single- and multi-site data, respectively.
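
As a rough illustration of the principal component analysis step mentioned in the abstract, the sketch below reduces a block of per-frame spatial features with PCA and appends the result to per-frame spectral features. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `pca_project` and `fuse_features`, the feature dimensions, the number of retained components, and the random input data are all hypothetical, and the neural network stage is not shown.

```python
import numpy as np

def pca_project(features, n_components):
    """Project the rows of `features` onto their top principal components."""
    centered = features - features.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def fuse_features(spectral, spatial, n_components):
    """Append PCA-reduced spatial features to spectral features, frame by frame."""
    if spectral.shape[0] != spatial.shape[0]:
        raise ValueError("spectral and spatial features need the same number of frames")
    reduced = pca_project(spatial, n_components)
    return np.hstack([spectral, reduced])

# Synthetic stand-in data; none of these values or sizes come from the paper.
rng = np.random.default_rng(0)
spectral = rng.normal(size=(200, 12))  # hypothetical: 12 spectral values per frame
spatial = rng.normal(size=(200, 6))    # hypothetical: 6 spatial values per frame

fused = fuse_features(spectral, spatial, n_components=2)
print(fused.shape)  # (200, 14)
```

In a setup like this, the fused vectors would then be passed to an overlap detector. How the paper combines the two feature streams in detail is not stated in the abstract.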

Bibliographic reference. Zelenák, Martin / Segura, Carlos / Hernando, Javier (2010): "Overlap detection for speaker diarization by fusing spectral and spatial features", in INTERSPEECH-2010, 2302-2305.