7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

The 2001 GMTK-Based SPINE ASR System

Özgür Çetin, Harriet J. Nock, Katrin Kirchhoff, Jeff A. Bilmes, Mari Ostendorf

University of Washington, USA

This paper describes in detail the University of Washington automatic speech recognition (ASR) system for the 2001 DARPA SPeech In Noisy Environments (SPINE) task. Our system makes heavy use of the graphical modeling toolkit (GMTK), a general-purpose graphical-modeling-based ASR system that allows arbitrary parameter tying, flexible deterministic and stochastic dependencies between variables, and a generalized maximum likelihood parameter estimation algorithm. In our SPINE system, GMTK was used for acoustic model training, whereas feature extraction, speaker adaptation, and first-pass decoding were performed with HTK. Our integrated GMTK/HTK system demonstrates the relative merits of each tool. Novel aspects of our SPINE system include generalized EM training and the capture of correlations among feature vectors via a globally shared, factored sparse inverse covariance matrix.
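As a rough illustration of what a factored sparse inverse covariance can look like, the sketch below evaluates a Gaussian log-density whose precision matrix is written as K = U^T D U, with D a positive diagonal matrix and U a unit upper-triangular matrix holding only a few nonzero off-diagonal entries. The abstract does not state the factorization, so this form is an assumption, and the function name and the dictionary layout for U are hypothetical choices made for this example; it is not GMTK code.

```python
import math


def factored_gaussian_logpdf(x, mean, diag, offdiag):
    """Log-density of a Gaussian with precision K = U^T D U (assumed form).

    diag    -- positive diagonal entries of D, one per dimension
    offdiag -- sparse off-diagonal entries of the unit upper-triangular U,
               given as {(i, j): value} with j > i (hypothetical layout)
    """
    d = len(x)
    centered = [xi - mi for xi, mi in zip(x, mean)]

    # y = U @ centered. U has ones on its diagonal, so start from a copy
    # and add only the stored off-diagonal contributions.
    y = list(centered)
    for (i, j), u in offdiag.items():
        if j <= i:
            raise ValueError("U must be strictly upper triangular off the diagonal")
        y[i] += u * centered[j]

    # Quadratic form (x - mean)^T K (x - mean) = sum_i D_i * y_i^2.
    quad = sum(dk * yk * yk for dk, yk in zip(diag, y))

    # det(U) = 1, so log det K = sum_i log D_i.
    log_det = sum(math.log(dk) for dk in diag)

    return 0.5 * (log_det - d * math.log(2.0 * math.pi) - quad)


# Two-dimensional example: a single off-diagonal entry couples the dimensions.
value = factored_gaussian_logpdf(
    x=[1.0, 2.0], mean=[0.0, 0.0], diag=[2.0, 1.0], offdiag={(0, 1): 0.5}
)
```

In this form the cost of one evaluation grows with the number of stored off-diagonal entries rather than with the square of the feature dimension. If U is shared across all Gaussian components, its product with the observation can also be computed once per frame and reused.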

Bibliographic reference. Çetin, Özgür / Nock, Harriet J. / Kirchhoff, Katrin / Bilmes, Jeff A. / Ostendorf, Mari (2002): "The 2001 GMTK-based SPINE ASR system", in ICSLP-2002, 1037-1040.