11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Integrate Template Matching and Statistical Modeling for Speech Recognition

Xie Sun, Yunxin Zhao

University of Missouri, USA

We propose a novel approach that integrates template matching with statistical modeling to improve continuous speech recognition. Each frame of a speech template is represented by multiple Gaussian mixture model (GMM) indices, template representatives are generated by agglomerative clustering, and the log likelihood ratio serves as the local distance measure for dynamic time warping (DTW) template matching in lattice rescoring. Experimental results on the TIMIT phone recognition task show that the proposed approach consistently and significantly improved several hidden Markov model (HMM) baselines: the absolute accuracy gain was 1.69% to 1.83% when all training templates were used, and 1.29% to 1.37% when template representatives were used.
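
As a rough illustration of the DTW template matching step named in the abstract, the following minimal Python sketch aligns a test sequence with a template by dynamic programming. It is not the authors' implementation. The function names `dtw_distance` and `abs_diff` are hypothetical, the step pattern (insertion, deletion, match) is an assumed textbook choice, and the absolute-difference distance on scalar frames is only a stand-in, since the abstract does not define how the log likelihood ratio distance is computed from GMM indices.

```python
from typing import Callable, Sequence, TypeVar

Frame = TypeVar("Frame")


def dtw_distance(
    test: Sequence[Frame],
    template: Sequence[Frame],
    local_distance: Callable[[Frame, Frame], float],
) -> float:
    """Accumulated cost of the best DTW alignment between two sequences.

    `local_distance` is pluggable; in the paper's setting it would be a
    log likelihood ratio measure, which is NOT reproduced here.
    """
    n, m = len(test), len(template)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    inf = float("inf")
    # cost[i][j]: best accumulated cost aligning test[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local_distance(test[i - 1], template[j - 1])
            # Assumed step pattern: vertical, horizontal, and diagonal moves.
            cost[i][j] = d + min(
                cost[i - 1][j],      # test frame repeats against same template frame
                cost[i][j - 1],      # template frame repeats against same test frame
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def abs_diff(a: float, b: float) -> float:
    """Stand-in local distance on scalar 'frames' (an assumption for the demo)."""
    return abs(a - b)


if __name__ == "__main__":
    # The template stretches the middle frame, so the alignment cost is zero.
    print(dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0], abs_diff))  # 0.0
```

In a lattice rescoring setup such as the one the abstract describes, a score of this kind would presumably be computed for each phone segment hypothesis against its matching templates; how those scores are combined with the HMM scores is not stated in the abstract and is left out of the sketch.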

Bibliographic reference.  Sun, Xie / Zhao, Yunxin (2010): "Integrate template matching and statistical modeling for speech recognition", In INTERSPEECH-2010, 74-77.