Two-Pass IB Based Speaker Diarization System Using Meeting-Specific ANN Based Features

Nauman Dawalatabad, Srikanth Madikeri, Chandra Sekhar C., Hema A. Murthy


In this paper, we present a two-pass Information Bottleneck (IB) based system for speaker diarization which uses meeting-specific artificial neural network (ANN) based features. We first use an IB based speaker diarization system to obtain labelled speaker segments. These segments are then refined using Kullback-Leibler Hidden Markov Model (KL-HMM) based re-segmentation. A multi-layer ANN is then trained to discriminate between these speakers using the re-segmented output labels and the spectral features. We then extract bottleneck features from the trained ANN and perform principal component analysis (PCA) on them. After PCA, these bottleneck features are used along with different spectral features in a second pass of the same IB based system with KL-HMM re-segmentation. Our experiments on the NIST RT and AMI datasets show that the proposed system performs better than the baseline IB system in terms of speaker error rate (SER), with a best-case relative improvement of 28.6% on the AMI datasets and 27.1% on the NIST RT04eval dataset.
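
As a rough illustration of the PCA step applied to the bottleneck features, the following minimal sketch projects a frames-by-dimensions feature matrix onto its leading principal components. It is not the authors' implementation: the function name `pca_project`, the matrix sizes, the number of retained components, and the synthetic random data standing in for real bottleneck activations are all assumptions made for this example.

```python
import numpy as np


def pca_project(features, n_components):
    """Project the rows of `features` onto the top principal components.

    `features` is a (frames x dims) matrix. Returns the projected data and
    the variance captured by each retained component.
    """
    # Center each feature dimension before computing principal directions.
    centered = features - features.mean(axis=0)
    # SVD of the centered data: rows of `vt` are the principal directions,
    # ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    variance = singular_values[:n_components] ** 2 / (features.shape[0] - 1)
    return centered @ components.T, variance


# Synthetic stand-in for bottleneck activations (hypothetical sizes):
# 500 frames of 40-dimensional, correlated features.
rng = np.random.default_rng(0)
bottleneck = rng.normal(size=(500, 40)) @ rng.normal(size=(40, 40))

# Keep the 10 leading components (an arbitrary choice for this sketch).
reduced, variance = pca_project(bottleneck, n_components=10)
print(reduced.shape)
```

The projected dimensions are mutually decorrelated and ordered by decreasing variance. How many components to retain is left open here, since the abstract does not state it.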


DOI: 10.21437/Interspeech.2016-714

Cite as

Dawalatabad, N., Madikeri, S., C., C.S., Murthy, H.A. (2016) Two-Pass IB Based Speaker Diarization System Using Meeting-Specific ANN Based Features. Proc. Interspeech 2016, 2199-2203.

Bibtex
@inproceedings{Dawalatabad+2016,
author={Nauman Dawalatabad and Srikanth Madikeri and Chandra Sekhar C. and Hema A. Murthy},
title={Two-Pass IB Based Speaker Diarization System Using Meeting-Specific ANN Based Features},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-714},
url={http://dx.doi.org/10.21437/Interspeech.2016-714},
pages={2199--2203}
}