8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Speaker Indexing In Audio Archives Using Test Utterance Gaussian Mixture Modeling

Hagai Aronowitz (1), David Burshtein (2), Amihood Amir (1)

(1) Bar-Ilan University, Israel
(2) Tel-Aviv University, Israel

Speaker indexing has recently emerged as an important task due to the rapidly growing volume of audio archives. Current filtration techniques still suffer from problems in both accuracy and efficiency. The major reason for the drawbacks of existing solutions is their use of inaccurate anchor models. The contribution of this paper is twofold. On the theoretical side, a new method is developed for simulating GMM scoring. This makes it possible to fit a GMM not only to every target speaker but also to every test call, and then to compute the likelihood of the test utterance using these GMMs instead of the original data. The second contribution of this paper is in harnessing this GMM simulation to achieve very efficient speaker indexing in terms of both search time and index size. Results on the SPIDRE corpus show that our approach maintains the accuracy of the conventional GMM algorithm.
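
The idea of scoring a test utterance from its GMM rather than from its frames can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not spell out. It assumes diagonal-covariance GMMs stored as (weights, means, variances) tuples, and it approximates the average per-frame log-likelihood by matching each test component to its best target component using the closed-form expected log-density of one Gaussian under another. The function names `expected_log_gauss` and `simulated_score` are hypothetical.

```python
import numpy as np

def expected_log_gauss(mu_q, var_q, mu_p, var_p):
    """Closed form of E_{x ~ N(mu_q, diag(var_q))}[log N(x; mu_p, diag(var_p))]."""
    mu_q, var_q = np.asarray(mu_q, float), np.asarray(var_q, float)
    mu_p, var_p = np.asarray(mu_p, float), np.asarray(var_p, float)
    return -0.5 * np.sum(
        np.log(2.0 * np.pi * var_p) + (var_q + (mu_q - mu_p) ** 2) / var_p
    )

def simulated_score(test_gmm, target_gmm):
    """Approximate the average log-likelihood of the test data under the
    target GMM using only the parameters of the test GMM (assumed scheme:
    each test component is scored against its best target component)."""
    w_t, mu_t, var_t = test_gmm
    w_s, mu_s, var_s = target_gmm
    score = 0.0
    for wi, mi, vi in zip(w_t, mu_t, var_t):
        best = max(
            np.log(wj) + expected_log_gauss(mi, vi, mj, vj)
            for wj, mj, vj in zip(w_s, mu_s, var_s)
        )
        score += wi * best
    return float(score)
```

Under these assumptions, the cost of one comparison depends on the number of mixture components rather than on the number of frames in the test call, which is the kind of saving in search time and index size that the abstract refers to.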

Bibliographic reference.  Aronowitz, Hagai / Burshtein, David / Amir, Amihood (2004): "Speaker indexing in audio archives using test utterance Gaussian mixture modeling", In INTERSPEECH-2004, 609-612.