10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Development of the GALE 2008 Mandarin LVCSR System

C. Plahl, Björn Hoffmeister, Georg Heigold, Jonas Lööf, Ralf Schlüter, Hermann Ney

RWTH Aachen University, Germany

This paper describes recent improvements to the RWTH Mandarin LVCSR system. We introduce vocal tract length normalization for Gammatone features and present comparable results for Gammatone-based and classical feature extraction. To benefit from the 1600h of data available in the GALE project, we trained acoustic models with up to 8M Gaussians. We present detailed character error rates for different numbers of Gaussians.

Several kinds of systems are developed and combined in a two-stage decoding framework that uses cross-adaptation followed by lattice-based system combination. In addition to various acoustic front-ends, these systems use different kinds of neural network toneme posterior features. We present detailed recognition results for the development cycle and for the different acoustic front-ends of the systems. Finally, we compare the final evaluation system to last year's system and report a 10% relative improvement.
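
For readers unfamiliar with the metric, the following minimal sketch shows how a character error rate and a relative improvement between two systems are commonly computed. It is an illustration only, not the authors' scoring tool: the function names are hypothetical, the error rate is taken as the Levenshtein distance between reference and hypothesis characters divided by the reference length, and the numbers in the usage lines are made-up values rather than results from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # cost of deleting the first i reference characters
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """Character errors divided by the number of reference characters."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_improvement(baseline, new):
    """Relative reduction in error rate of `new` with respect to `baseline`."""
    return (baseline - new) / baseline


# Illustrative values only (not taken from the paper).
cer = character_error_rate("abcd", "abxd")   # one substitution in four characters
gain = relative_improvement(0.20, 0.18)      # hypothetical baseline and new error rates
```

Under this definition, a relative improvement of 10% means the new error rate is 0.9 times the baseline error rate, whatever the absolute values are.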

Bibliographic reference.  Plahl, C. / Hoffmeister, Björn / Heigold, Georg / Lööf, Jonas / Schlüter, Ralf / Ney, Hermann (2009): "Development of the GALE 2008 Mandarin LVCSR system", In INTERSPEECH-2009, 2107-2110.