INTERSPEECH 2010
11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

A Comparative Large Scale Study of MLP Features for Mandarin ASR

Fabio Valente (1), Mathew Magimai Doss (1), Christian Plahl (2), Suman V. Ravuri (3), Wen Wang (4)

(1) Idiap Research Institute, Switzerland
(2) RWTH Aachen University, Germany
(3) ICSI, USA
(4) SRI International, USA

MLP-based front-ends have shown significant complementary properties to conventional spectral features. As part of the DARPA GALE program, different MLP features were developed for Mandarin ASR. In this paper, all the proposed front-ends are compared in a systematic manner, and we extensively investigate the scalability of these features in terms of the amount of training data (from 100 hours to 1600 hours) and system complexity (maximum likelihood training, SAT training, lattice-level combination, and discriminative training). Results on 5 hours of evaluation data from the GALE project reveal that the MLP features consistently produce relative improvements in the range of 15% to 23% at the different steps of a multipass system when compared to conventional short-term spectral features such as MFCC and PLP.

Bibliographic reference. Valente, Fabio / Doss, Mathew Magimai / Plahl, Christian / Ravuri, Suman V. / Wang, Wen (2010): "A comparative large scale study of MLP features for Mandarin ASR", In INTERSPEECH-2010, 2630-2633.