8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

On Using MLP Features in LVCSR

Qifeng Zhu (1), Barry Chen (1), Nelson Morgan (1), Andreas Stolcke (2)

(1) International Computer Science Institute, USA
(2) SRI International, USA

One of the major research thrusts in the speech group at ICSI is the use of Multi-Layer Perceptron (MLP) based features in automatic speech recognition (ASR). This paper presents a study of three aspects of this effort: 1) the properties of MLP features that make them useful, 2) incorporating MLP features together with PLP features in ASR, and 3) possible redundancy between MLP features and more conventional system refinements such as discriminative training and system combination. The paper shows that MLP transformations yield variables that have regular distributions, which can be further modified by taking the logarithm to make the distribution easier to model with a Gaussian HMM. Two or more vectors of these features can easily be combined without increasing the feature dimension. Recognition results show that MLP features can significantly improve performance in large vocabulary continuous speech recognition (LVCSR) on the NIST 2001 Hub-5 evaluation set with models trained on the Switchboard Corpus, even when discriminative training and system combination are used.
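
As a rough illustration of two ideas in the abstract (taking the logarithm of MLP outputs, and combining feature vectors without increasing their dimension), the following minimal Python sketch may help. It rests on assumptions not stated in the abstract: the MLP outputs are treated as softmax class posteriors, the combination rule is a plain frame-level average, and the input activations are random toy values. The function names are hypothetical, and the paper's actual procedure may differ.

```python
import math
import random


def softmax(activations):
    """Turn raw output activations into posteriors that sum to 1."""
    peak = max(activations)  # subtract the max for numerical stability
    exps = [math.exp(a - peak) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]


def log_posteriors(posteriors, floor=1e-10):
    """Take the log of each posterior; the floor (an assumption) avoids log(0)."""
    return [math.log(max(p, floor)) for p in posteriors]


def average_posteriors(streams):
    """Combine same-length posterior vectors by averaging (assumed rule).

    The result has the same dimension as each input vector.
    """
    count = len(streams)
    return [sum(values) / count for values in zip(*streams)]


# Toy data: two hypothetical MLPs, each emitting 5 class activations per frame.
rng = random.Random(0)
num_classes = 5
frame_a = softmax([rng.gauss(0.0, 3.0) for _ in range(num_classes)])
frame_b = softmax([rng.gauss(0.0, 3.0) for _ in range(num_classes)])

combined = average_posteriors([frame_a, frame_b])
features = log_posteriors(combined)

print(len(combined), len(features))
```

In this sketch the combined vector is still a valid posterior distribution of the original size, so a second stream adds no extra dimensions to the features passed downstream.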


Bibliographic reference. Zhu, Qifeng / Chen, Barry / Morgan, Nelson / Stolcke, Andreas (2004): "On using MLP features in LVCSR", In INTERSPEECH-2004, 921-924.