In this work, we investigate a dimensionality reduction scheme that allows Time Delay of Arrival (TDOA) features from all microphones to be used in a traditional HMM/GMM system. The subspace dimension is selected based on the dimension of the TDOA vectors in an ideal recording, i.e., one without environmental distortion or interference. Experiments on a dataset used in the NIST Meeting Diarization evaluation reveal that reducing the TDOA vectors to a considerably lower dimension reduces the diarization error by 3.7% (30% relative). The proposed scheme has the advantage that, unlike previous methods, it does not require any development set tuning to select the dimension, and it still retains competitive performance (5% better than the results obtained with tuning).
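
The idea of choosing the subspace dimension from an ideal recording can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: PCA is used as the projection, the target dimension is taken as the rank of noise-free pairwise TDOA vectors, and the per-microphone delays are random numbers rather than values derived from a real array geometry. The function names `pairwise_tdoa` and `pca_project` are hypothetical. With M microphones there are M(M-1)/2 pairwise delays, but in the noise-free case they are differences of only M per-microphone delays, so they span at most M-1 dimensions.

```python
import itertools
import numpy as np


def pairwise_tdoa(delays):
    """Build all-pair TDOA vectors from per-microphone arrival delays.

    delays: array of shape (n_frames, n_mics).
    Returns an array of shape (n_frames, n_mics * (n_mics - 1) / 2).
    """
    n_mics = delays.shape[1]
    pairs = itertools.combinations(range(n_mics), 2)
    return np.stack([delays[:, i] - delays[:, j] for i, j in pairs], axis=1)


def pca_project(x, k):
    """Project the rows of x onto their first k principal directions."""
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


rng = np.random.default_rng(0)
n_mics, n_frames = 8, 500

# Assumption: random per-microphone delays stand in for an ideal recording.
delays = rng.normal(size=(n_frames, n_mics))
ideal = pairwise_tdoa(delays)  # 28-dimensional TDOA vectors

# Dimension actually spanned by the noise-free vectors (n_mics - 1 here).
ideal_rank = np.linalg.matrix_rank(ideal - ideal.mean(axis=0))

# Assumption: additive noise stands in for distortion and interference.
noisy = ideal + 0.05 * rng.normal(size=ideal.shape)
reduced = pca_project(noisy, ideal_rank)

print(ideal.shape, ideal_rank, reduced.shape)
```

In this toy setting the target dimension follows from the microphone count alone, which matches the abstract's point that no development set is needed to select it; whether the paper uses exactly this dimension or projection is not stated in the abstract.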
Index Terms: Speaker diarization, Time Delay of Arrival, Dimensionality reduction
Cite as: Vijayasenan, D., Valente, F. (2012) Dimensionality reduction of large TDOA vectors for speaker diarization. Proc. SAPA-SCALE conference (SAPA 2012), 64-67
@inproceedings{vijayasenan12_sapa,
  author={Deepu Vijayasenan and Fabio Valente},
  title={{Dimensionality reduction of large TDOA vectors for speaker diarization}},
  year=2012,
  booktitle={Proc. SAPA-SCALE conference (SAPA 2012)},
  pages={64--67}
}