In this study, we develop a new system for real-world audio environment matching. Environment detection within unknown audio streams requires a system that operates in an unsupervised manner, since it will encounter unknown environments without prior information. In addition, the overall solution should be computationally efficient for large audio collections. In the proposed approach, a Gaussian mixture model (GMM) is trained on large amounts of unlabeled audio data and used as a background acoustic model. Subsequently, an acoustic signature vector (ASV) is computed for each environment. Here, the ASV is designed to capture the unique acoustic characteristics of an environment. Using the ASVs, we demonstrate that it is possible to compute an effective similarity measure between two acoustic environments. We demonstrate the performance of the proposed system on real-world audio data and compare it to a traditional GMM-UBM (universal background model) system. Experiments show that our system achieves an equal error rate (EER) that is 35% better than that of a baseline GMM-UBM system.
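The pipeline described above (background GMM, per-environment signature, similarity between signatures) can be illustrated with a minimal sketch. The abstract does not define how the ASV or the similarity measure is computed, so everything below is an assumption made for illustration: the signature is taken to be the mean posterior probability of each background-model component over an environment's frames, the similarity is taken to be cosine similarity, and the toy model parameters, frame values, and function names are hypothetical. It is not the authors' method.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given frame x."""
    logs = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]

def signature(frames, weights, means, variances):
    """Assumed signature: mean component posterior over all frames."""
    acc = [0.0] * len(weights)
    for x in frames:
        for k, p in enumerate(posteriors(x, weights, means, variances)):
            acc[k] += p
    return [a / len(frames) for a in acc]

def cosine_similarity(a, b):
    """Assumed similarity measure between two signature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical background model: two components in a 2-D feature space.
# In practice the parameters would come from training on unlabeled audio.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

# Made-up feature frames: A and B resemble each other, C does not.
env_a = [[0.1, -0.2], [0.3, 0.1], [-0.4, 0.2]]
env_b = [[0.2, 0.0], [-0.1, 0.3], [0.0, -0.3]]
env_c = [[5.1, 4.8], [4.7, 5.2], [5.3, 5.0]]

asv_a = signature(env_a, weights, means, variances)
asv_b = signature(env_b, weights, means, variances)
asv_c = signature(env_c, weights, means, variances)

sim_ab = cosine_similarity(asv_a, asv_b)
sim_ac = cosine_similarity(asv_a, asv_c)
```

Under these assumptions, each environment is reduced to one fixed-length vector regardless of its duration, so comparing two environments costs a single vector operation. That is one plausible way to meet the efficiency requirement stated above for large audio collections.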
Index Terms: Audio Environment Detection, Acoustic Signature, Real world audio data, Prof-Life-Log
Bibliographic reference. Ziaei, Ali / Sangwan, Abhijeet / Hansen, John H. L. (2012): "Prof-life-log: audio environment detection for naturalistic audio streams", In INTERSPEECH-2012, 2514-2517.