INTERSPEECH 2009
10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Hierarchical Processing of the Modulation Spectrum for GALE Mandarin LVCSR System

Fabio Valente (1), Mathew Magimai-Doss (1), C. Plahl (2), Suman Ravuri (3)

(1) IDIAP Research Institute, Switzerland
(2) RWTH Aachen University, Germany
(3) ICSI, USA

This paper investigates the use of TANDEM features based on hierarchical processing of the modulation spectrum. The study is carried out in the framework of the GALE project for recognition of Mandarin broadcast data. We describe the improvements obtained using the hierarchical processing and the addition of features such as pitch and short-term critical band energy. Results are consistent with previous findings on a different LVCSR task, suggesting that the proposed technique is effective and robust across several conditions. Furthermore, we describe the integration into the RWTH GALE LVCSR system trained on 1600 hours of Mandarin data and present progress across the GALE 2007 and GALE 2008 RWTH systems, resulting in approximately 20% CER reduction on several data sets.

Bibliographic reference. Valente, Fabio / Magimai-Doss, Mathew / Plahl, C. / Ravuri, Suman (2009): "Hierarchical processing of the modulation spectrum for GALE Mandarin LVCSR system", in INTERSPEECH-2009, 2963-2966.