ISCA Archive Interspeech 2006

Discriminative kernel-based phoneme sequence recognition

Joseph Keshet, Shai Shalev-Shwartz, Samy Bengio, Yoram Singer, Dan Chazan

We describe a new method for phoneme sequence recognition given a speech utterance, which is not based on the HMM. In contrast to HMM-based approaches, our method uses a discriminative kernel-based training procedure in which the learning process is tailored to the goal of minimizing the Levenshtein distance between the predicted phoneme sequence and the correct sequence. The phoneme sequence predictor is devised by mapping the speech utterance, along with a proposed phoneme sequence, to a vector space endowed with an inner product that is realized by a Mercer kernel. Building on large-margin techniques for predicting whole sequences, we devise a learning algorithm that reduces to separating the correct phoneme sequence from all other sequences. We describe an iterative algorithm for learning the phoneme sequence recognizer and further describe an efficient implementation of it. We present encouraging initial experimental results on the TIMIT corpus and compare the proposed method to an HMM-based approach.
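As an illustration of the evaluation measure named in the abstract, the Levenshtein (edit) distance between a predicted and a correct phoneme sequence can be computed with standard dynamic programming. The sketch below is a generic textbook implementation, not the authors' code; the function name `levenshtein` and the example phoneme labels are assumptions made for illustration, and unit costs for insertion, deletion, and substitution are assumed.

```python
def levenshtein(predicted, reference):
    """Edit distance between two sequences of phoneme labels.

    Assumes unit cost for each insertion, deletion, and substitution.
    """
    # prev[j] is the distance between the prefix of `predicted` processed
    # so far and the first j symbols of `reference`.
    prev = list(range(len(reference) + 1))
    for i, p in enumerate(predicted, start=1):
        curr = [i]  # distance from i predicted symbols to an empty reference
        for j, r in enumerate(reference, start=1):
            cost = 0 if p == r else 1
            curr.append(min(
                prev[j] + 1,         # drop a symbol from `predicted`
                curr[j - 1] + 1,     # add a symbol to match `reference`
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


# Hypothetical example: one substituted phoneme between the two sequences.
distance = levenshtein(["k", "ae", "t"], ["k", "ah", "t"])
```

A lower distance means the predicted sequence is closer to the correct one; a distance of zero means the two sequences are identical.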

doi: 10.21437/Interspeech.2006-217

Cite as: Keshet, J., Shalev-Shwartz, S., Bengio, S., Singer, Y., Chazan, D. (2006) Discriminative kernel-based phoneme sequence recognition. Proc. Interspeech 2006, paper 1284-Mon3BuP.2, doi: 10.21437/Interspeech.2006-217

  author={Joseph Keshet and Shai Shalev-Shwartz and Samy Bengio and Yoram Singer and Dan Chazan},
  title={{Discriminative kernel-based phoneme sequence recognition}},
  booktitle={Proc. Interspeech 2006},
  pages={paper 1284-Mon3BuP.2},