INTERSPEECH 2009
10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Trimmed KL Divergence Between Gaussian Mixtures for Robust Unsupervised Acoustic Anomaly Detection

Nash Borges, Gerard G. L. Meyer

Johns Hopkins University, USA

In previous work [1], we presented several implementations of acoustic anomaly detection by training a model on purely normal data and estimating the divergence between it and other input. Here, we reformulate the problem in an unsupervised framework and allow for anomalous contamination of the training data. We focus exclusively on methods employing Gaussian mixture models (GMMs), since they are widely used in speech processing systems. After analyzing why the Kullback-Leibler (KL) divergence between GMMs breaks down in the face of training contamination, we arrived at a promising solution. By trimming the most divergent quarter of the Gaussians from the mixture model, we significantly outperformed the untrimmed approximation for contamination levels of 10% and above, reducing the equal error rate from 33.8% to 6.4% at 33% contamination. The performance of the trimmed KL divergence showed no significant dependence on the investigated contamination levels.
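The abstract only summarizes the trimming idea; as a rough illustration, the sketch below uses the standard per-component Monte Carlo decomposition of KL(p || q) between two GMMs and then discards the most divergent quarter of the reference model's components before re-averaging. The function names, the dict-based GMM representation, and the use of Monte Carlo estimation are assumptions for illustration, not the authors' implementation.

    import numpy as np
    from scipy.stats import multivariate_normal

    def gmm_logpdf(x, weights, means, covs):
        # Log-density of a GMM evaluated at each row of x, via log-sum-exp.
        comp = np.stack([
            np.log(w) + multivariate_normal.logpdf(x, mean=m, cov=c)
            for w, m, c in zip(weights, means, covs)
        ], axis=0)                                  # shape (K, N)
        return np.logaddexp.reduce(comp, axis=0)    # shape (N,)

    def trimmed_kl(p, q, n_samples=5000, trim_frac=0.25, seed=None):
        # Monte Carlo estimate of KL(p || q) between two GMMs, dropping the
        # trim_frac most divergent components of p before the final average.
        # p and q are dicts with keys 'weights', 'means', 'covs' (a hypothetical
        # representation assumed here for illustration).
        rng = np.random.default_rng(seed)
        K = len(p['weights'])

        # Exact decomposition: KL(p||q) = sum_k w_k * E_{p_k}[log p(x) - log q(x)],
        # with each expectation estimated by sampling from component k of p.
        comp_div = np.empty(K)
        for k in range(K):
            x = rng.multivariate_normal(p['means'][k], p['covs'][k], size=n_samples)
            comp_div[k] = np.mean(gmm_logpdf(x, **p) - gmm_logpdf(x, **q))

        # Keep the least divergent (1 - trim_frac) fraction of components,
        # renormalize their weights, and average their contributions.
        n_keep = int(np.ceil((1.0 - trim_frac) * K))
        keep = np.argsort(comp_div)[:n_keep]
        w = np.asarray(p['weights'], dtype=float)[keep]
        return float(np.dot(w / w.sum(), comp_div[keep]))

In this reading, components whose expected log-ratio is inflated by anomalous contamination of the training mixture contribute the largest per-component divergences and are the ones removed; with trim_frac = 0.25 this matches the one-quarter trimming level reported above.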

Reference

  1. N. Borges and G. G. L. Meyer, “Unsupervised distributional anomaly detection for a self-diagnostic speech activity detector,” in CISS, 2008.


Bibliographic reference. Borges, Nash / Meyer, Gerard G. L. (2009): "Trimmed KL divergence between Gaussian mixtures for robust unsupervised acoustic anomaly detection", In INTERSPEECH-2009, 2555-2558.