7th International Conference on Spoken Language Processing
September 16-20, 2002
This paper provides a detailed description of the University of Washington automatic speech recognition (ASR) system for the 2001 DARPA SPeech In Noisy Environments (SPINE) task. Our system makes heavy use of the Graphical Models Toolkit (GMTK), a general-purpose graphical-model-based ASR system that allows arbitrary parameter tying, flexible deterministic and stochastic dependencies between variables, and a generalized maximum likelihood parameter estimation algorithm. In our SPINE system, GMTK was used for acoustic model training, whereas feature extraction, speaker adaptation, and first-pass decoding were performed by HTK. Our integrated GMTK/HTK system demonstrates the relative merits of each tool. Novel aspects of our SPINE system include capturing correlations among feature vectors via a globally shared factored sparse inverse covariance matrix, and generalized EM training.
Bibliographic reference. Çetin, Özgür / Nock, Harriet J. / Kirchhoff, Katrin / Bilmes, Jeff A. / Ostendorf, Mari (2002): "The 2001 GMTK-based SPINE ASR system", In ICSLP-2002, 1037-1040.