8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Hierarchical Neural Networks Feature Extraction for LVCSR System

Fabio Valente (1), Jithendra Vepa (1), Christian Plahl (2), Christian Gollan (2), Hynek Hermansky (1), Ralf Schlüter (2)

(1) IDIAP Research Institute, Switzerland
(2) RWTH Aachen University, Germany

This paper investigates the use of a hierarchy of neural networks (NNs) for data-driven feature extraction. Two hierarchical structures, based on long and short temporal context, are considered. Features are tested on two different LVCSR systems: one for meetings data (RT05 evaluation data) and one for Arabic broadcast news (BNAT05 evaluation data). The hierarchical NN features consistently outperform the single-NN features across different types of data and tasks, and provide significant improvements with respect to the respective baseline systems. The best results are obtained when different time resolutions are used at different levels of the hierarchy.

Bibliographic reference. Valente, Fabio / Vepa, Jithendra / Plahl, Christian / Gollan, Christian / Hermansky, Hynek / Schlüter, Ralf (2007): "Hierarchical neural networks feature extraction for LVCSR system", in INTERSPEECH-2007, 42-45.