Overlapping speech accounts for a share of the errors made by standard speaker diarization systems in meeting environments. We investigate a set of prosody-based long-term features as a potential complement to our overlap detection system, which relies on short-term spectral parameters. The most relevant features are selected in a two-step process: they are first evaluated and ranked according to the minimum-redundancy maximum-relevance (mRMR) criterion, and the optimal number of features is then determined by an iterative wrapper approach. We show that adding prosodic features decreases overlap detection error. The detected overlap segments are used in speaker diarization to recover missed speech by assigning multiple speaker labels and to increase the purity of speaker clusters.
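The two-step selection described above can be illustrated with a minimal sketch. It is not the authors' implementation. The histogram-based mutual information estimate, the difference form of the mRMR score, the nearest-centroid classifier standing in for the overlap detector, and the synthetic data are all assumptions made for illustration. All function names are hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information (nats) between two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def mrmr_rank(X, y, bins=8):
    """Step 1: greedy mRMR ordering (relevance minus mean redundancy)."""
    n_feat = X.shape[1]
    relevance = np.array(
        [mutual_information(X[:, j], y, bins) for j in range(n_feat)]
    )
    ranked = [int(np.argmax(relevance))]
    remaining = [j for j in range(n_feat) if j != ranked[0]]
    while remaining:
        scores = []
        for j in remaining:
            redundancy = np.mean(
                [mutual_information(X[:, j], X[:, k], bins) for k in ranked]
            )
            scores.append(relevance[j] - redundancy)
        best = remaining[int(np.argmax(scores))]
        ranked.append(best)
        remaining.remove(best)
    return ranked

def centroid_error(X_tr, y_tr, X_te, y_te):
    """Error rate of a nearest-class-centroid classifier (assumed stand-in)."""
    classes = np.unique(y_tr)
    centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])
    dist = ((X_te[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[np.argmin(dist, axis=1)]
    return float(np.mean(pred != y_te))

def wrapper_select(X_tr, y_tr, X_dev, y_dev, ranked):
    """Step 2: grow the set along the ranking, keep the size with lowest dev error."""
    errors = [
        centroid_error(X_tr[:, ranked[:k]], y_tr, X_dev[:, ranked[:k]], y_dev)
        for k in range(1, len(ranked) + 1)
    ]
    best_k = int(np.argmin(errors)) + 1
    return ranked[:best_k], errors

# Synthetic data (assumption): one informative feature, a near-duplicate of it,
# and one pure-noise feature; labels 0/1 mimic non-overlap/overlap frames.
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n)
informative = y + 0.5 * rng.normal(size=n)
duplicate = informative + 0.05 * rng.normal(size=n)
noise = rng.normal(size=n)
X = np.column_stack([noise, informative, duplicate])

ranked = mrmr_rank(X, y)
selected, errors = wrapper_select(X[:300], y[:300], X[300:], y[300:], ranked)
print("mRMR ranking:", ranked)
print("dev error per set size:", errors)
print("selected features:", selected)
```

The wrapper evaluates only nested prefixes of the mRMR ranking, so its cost grows linearly with the number of candidate features. In practice the stand-in classifier would be replaced by the actual overlap detector and the error measured on held-out development data.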
Bibliographic reference. Zelenák, Martin / Hernando, Javier (2011): "The detection of overlapping speech with prosodic features for speaker diarization", In INTERSPEECH-2011, 1041-1044.