In this paper we present an algorithm that uses the information contained in syllable lattices to significantly reduce the classification error rate of a children's speech reading tracker. The task is to verify whether each word in a reference string was actually spoken. A syllable graph is generated from the reference word string to represent acceptable pronunciation alternatives. A syllable-based continuous speech recognizer is used to generate a syllable lattice. The best alignment between the reference graph and the syllable lattice is determined using a dynamic programming algorithm. The speech vectors aligned with each syllable are then used as features for Support Vector Machine classifiers that accept or reject each syllable in the aligned path.
Experimental results over three children's speech corpora show that this algorithm can substantially reduce the classification error rate relative to both the standard word-based tracker and a simple best-path syllable-based tracker.
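To make the alignment step concrete, the following is a minimal sketch of a dynamic programming alignment between a reference and a syllable lattice. It is not the paper's implementation and rests on several simplifying assumptions: the reference is a single linear syllable sequence rather than a graph of pronunciation alternatives, the lattice is a DAG whose node indices are already in topological order, all edit operations have unit cost with no acoustic scores, and a syllable is "accepted" when its label matches exactly, standing in for the Support Vector Machine classifiers. The function name `align_reference_to_lattice` and the example syllables are hypothetical.

```python
from collections import defaultdict


def align_reference_to_lattice(reference, edges, num_nodes):
    """Align a linear reference syllable sequence to a syllable lattice.

    reference: list of syllable labels (assumed linear, unlike the paper's graph).
    edges: list of (src, dst, syllable) with src < dst (topological order assumed).
    num_nodes: number of lattice nodes; node 0 is the start, num_nodes - 1 the end.
    Returns (total_cost, accepted), where accepted[i] is True when reference
    syllable i was matched exactly on the best path.
    """
    inf = float("inf")
    ref_len = len(reference)

    outgoing = defaultdict(list)
    for src, dst, syllable in edges:
        outgoing[src].append((dst, syllable))

    # cost[n][i]: minimum edit cost to reach lattice node n after
    # consuming the first i reference syllables.
    cost = [[inf] * (ref_len + 1) for _ in range(num_nodes)]
    back = [[None] * (ref_len + 1) for _ in range(num_nodes)]
    cost[0][0] = 0

    for n in range(num_nodes):
        for i in range(ref_len + 1):
            c = cost[n][i]
            if c == inf:
                continue
            # Deletion: reference syllable i has no counterpart in the lattice.
            if i < ref_len and c + 1 < cost[n][i + 1]:
                cost[n][i + 1] = c + 1
                back[n][i + 1] = (n, i, "del")
            for dst, syllable in outgoing[n]:
                # Insertion: lattice syllable has no counterpart in the reference.
                if c + 1 < cost[dst][i]:
                    cost[dst][i] = c + 1
                    back[dst][i] = (n, i, "ins")
                # Match or substitution: consume one edge and one reference syllable.
                if i < ref_len:
                    is_match = syllable == reference[i]
                    step = 0 if is_match else 1
                    if c + step < cost[dst][i + 1]:
                        cost[dst][i + 1] = c + step
                        back[dst][i + 1] = (n, i, "match" if is_match else "sub")

    # Trace back from the final state to mark exactly matched reference syllables.
    accepted = [False] * ref_len
    n, i = num_nodes - 1, ref_len
    while back[n][i] is not None:
        prev_n, prev_i, op = back[n][i]
        if op == "match":
            accepted[prev_i] = True
        n, i = prev_n, prev_i

    return cost[num_nodes - 1][ref_len], accepted


# Hypothetical example: three reference syllables and a small 4-node lattice.
reference = ["rab", "bit", "runs"]
edges = [
    (0, 1, "rab"), (0, 1, "rob"),
    (1, 2, "bit"), (1, 2, "bid"),
    (2, 3, "sun"),
]
total_cost, accepted = align_reference_to_lattice(reference, edges, num_nodes=4)
print(total_cost, accepted)
```

In this toy lattice the first two reference syllables find exact matches, while the third can only be aligned to a different syllable, so it is rejected. A fuller version would weight each step by recognizer scores and pass the aligned speech vectors to a classifier instead of comparing labels.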
Bibliographic reference. Bolanos, Daniel / Ward, Wayne / Vuuren, Sarel Van / Garrido, Javier (2007): "Syllable lattices as a basis for a children's speech reading tracker", In INTERSPEECH-2007, 198-201.