EUROSPEECH 2003 - INTERSPEECH 2003
A major weakness of support vector machines (SVMs) has been their reliance on generic kernel functions, such as linear, polynomial, and Gaussian kernels, to compute distances among data points. These kernels do not exploit the inherent probability distributions of the data. Focusing on audio speaker identification and verification, we propose novel kernel functions that take full advantage of good probabilistic and descriptive models of audio data. We explore generative speaker identification models such as Gaussian Mixture Models (GMMs) and derive a kernel distance based on the Kullback-Leibler (KL) divergence between generative models. In effect, our approach combines the best of both generative and discriminative methods. Our results show that these new kernels perform as well as baseline GMM classifiers and outperform generic kernel-based SVMs in both speaker identification and verification on two different audio databases.
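To make the idea of a KL-divergence-based kernel concrete, the following is a minimal sketch, not the authors' implementation. It rests on several assumptions that the abstract does not state: each utterance is modelled by a single diagonal-covariance Gaussian rather than a full GMM (the KL divergence between two GMMs has no closed form and would need an approximation), the divergence is symmetrized by summing both directions, and it is turned into a similarity by exponentiation with a hypothetical `scale` parameter. All function names are illustrative.

```python
import math


def kl_diag_gaussian(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians.

    Assumption: a single Gaussian stands in for the generative model;
    the abstract refers to Gaussian Mixture Models.
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def symmetric_kl(mu_p, var_p, mu_q, var_q):
    """Symmetrized divergence: KL(p || q) + KL(q || p)."""
    return (kl_diag_gaussian(mu_p, var_p, mu_q, var_q)
            + kl_diag_gaussian(mu_q, var_q, mu_p, var_p))


def kl_kernel(model_a, model_b, scale=1.0):
    """Map the symmetric divergence to a similarity in (0, 1].

    Assumption: exponentiation and the `scale` parameter are
    illustrative choices, not taken from the abstract.
    """
    mu_a, var_a = model_a
    mu_b, var_b = model_b
    return math.exp(-scale * symmetric_kl(mu_a, var_a, mu_b, var_b))


# Two hypothetical one-dimensional "speaker models": (means, variances).
speaker_a = ([0.0], [1.0])
speaker_b = ([1.0], [1.0])
similarity = kl_kernel(speaker_a, speaker_b)
```

Identical models give a kernel value of 1, and the value decays toward 0 as the models diverge. A matrix of such pairwise values could then be supplied to an SVM that accepts a precomputed kernel, although whether the resulting matrix is positive semidefinite would need to be checked for the data at hand.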
Bibliographic reference. Moreno, Pedro J. / Ho, Purdy P. (2003): "A new SVM approach to speaker identification and verification using probabilistic distance kernels", In EUROSPEECH-2003, 2965-2968.