9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Implicit State-Tying for Support Vector Machines Based Speech Recognition

Daniel Bolaños, Wayne Ward

University of Colorado at Boulder, USA

In this article we take a step towards the application of Support Vector Machines (SVMs) to continuous speech recognition. As in previous work, we use SVMs to estimate emission probabilities in the context of an SVM/HMM system. However, training pairwise classifiers to discriminate between some of the HMM-states of very close phonetic classes produces unsatisfactory results. We propose a data-driven approach for selecting the HMM-states for which SVMs are trained and those that are implicitly tied.
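To make the idea of pairwise classifiers with implicitly tied states concrete, the following is a minimal sketch and not the method of the paper. It assumes each trained pairwise SVM yields a probability r[(i, j)] that a frame belongs to state i rather than state j, and it combines them with the pairwise coupling rule of Price et al. (1995), which the abstract does not name. Treating a pair with no trained classifier as uninformative (probability 0.5) is likewise an assumption, and the function name `couple_pairwise` is hypothetical.

```python
def couple_pairwise(r, n_states):
    """Combine pairwise probabilities into per-state scores.

    r maps (i, j) with i < j to P(state i | state i or j, frame).
    Pairs missing from r stand for implicitly tied states (assumption):
    no classifier was trained, so the pair contributes 0.5.
    """
    scores = []
    for i in range(n_states):
        total = 0.0
        for j in range(n_states):
            if j == i:
                continue
            p = r.get((min(i, j), max(i, j)), 0.5)
            if j < i:
                p = 1.0 - p  # stored value is for the lower-indexed state
            total += 1.0 / max(p, 1e-6)  # clip to avoid division by zero
        # Price et al. coupling: p_i = 1 / (sum_j 1/r_ij - (K - 2))
        scores.append(1.0 / (total - (n_states - 2)))
    norm = sum(scores)
    return [s / norm for s in scores]


# Three states; no classifier exists for the pair (1, 2), so they are tied.
probs = couple_pairwise({(0, 1): 0.9, (0, 2): 0.8}, n_states=3)
```

In this toy case state 0 receives the highest score, and states 1 and 2 are separated only through their classifiers against state 0.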

Additionally, we introduce an algorithm, incorporated into the decoder, that dynamically selects the subset of SVMs used to estimate the emission probabilities. This algorithm dramatically reduces the number of SVMs evaluated at the frame level while preserving recognition accuracy. We present results on a very challenging corpus of children's speech. Our approach outperforms not only comparable GMM/HMM-based systems but also other SVM/HMM systems proposed to date.
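The abstract does not describe the selection criterion, so the following is only an illustrative sketch of one way per-frame selection could work. It assumes the decoder exposes the set of HMM-states that are active at the current frame (for example, those surviving beam pruning) and evaluates only the trained pairwise classifiers whose two states are both active. The names `select_classifiers`, `active_states`, and `trained_pairs` are hypothetical.

```python
from itertools import combinations


def select_classifiers(active_states, trained_pairs):
    """Return the trained pairwise classifiers to evaluate at this frame.

    active_states: state indices currently active in the decoder (assumption).
    trained_pairs: set of (i, j) tuples, i < j, for which an SVM exists;
                   pairs absent from the set are implicitly tied.
    """
    active = sorted(set(active_states))
    # combinations() over a sorted list yields (i, j) with i < j
    return [pair for pair in combinations(active, 2) if pair in trained_pairs]


trained = {(0, 1), (0, 2), (1, 3), (2, 3)}
to_evaluate = select_classifiers([3, 0, 2], trained)
```

With three of the four states active, only two of the four trained classifiers need to be evaluated; the savings grow as the active set shrinks relative to the full state inventory.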

Bibliographic reference.  Bolaños, Daniel / Ward, Wayne (2008): "Implicit state-tying for support vector machines based speech recognition", In INTERSPEECH-2008, 924-927.