An important aspect of SVM-based speaker verification systems is the choice of dynamic kernel. For the GLDS kernel, a static kernel is used to map each observation into a higher order feature space. Features are then obtained by taking a simple average over all frames. Derivative kernels, such as the Fisher kernel, use a generative model as a principled way of extracting a fixed set of features from each utterance. However, the model and features are defined using the original observations. Here, a dynamic kernel is described that combines these two approaches. In general, it is not possible to explicitly train a model in the feature space associated with a static kernel. However, by using a suitable metric with approximate component posteriors, this form of dynamic kernel can be computed. This kernel generalises the GLDS and derivative kernel as special cases and is also closely related to parametric kernels such as the GMM-supervector kernel. Preliminary results using this kernel are presented on the 2002 NIST SRE dataset.
Bibliographic reference. Longworth, C. / Gales, M. J. F. (2008): "A generalised derivative kernel for speaker verification", In INTERSPEECH-2008, 1381-1384.