14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Training Log-Linear Acoustic Models in Higher-Order Polynomial Feature Space for Speech Recognition

M. Tahir (1), H. Huang (1), Ralf Schlüter (1), Hermann Ney (1), Louis ten Bosch (2), Bert Cranen (2), Lou Boves (2)

(1) RWTH Aachen University, Germany
(2) Radboud Universiteit Nijmegen, The Netherlands

The use of higher-order polynomial acoustic features can improve the performance of automatic speech recognition (ASR). However, the dimensionality of the polynomial representation can be prohibitively large, making the training of acoustic models on polynomial features infeasible for large-vocabulary ASR systems. This paper presents an iterative polynomial training framework for acoustic modeling, which recursively expands the current acoustic features into their second-order polynomial feature space. In each recursion, the dimensionality is reduced by a linear projection, so that increasingly higher-order polynomial information is incorporated while the dimensionality of the acoustic models stays constant. Experimental results obtained for a large-vocabulary continuous speech recognition task show that the proposed method outperforms conventional mixture models.
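
The recursion described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `second_order_expand` and `iterate_polynomial` are hypothetical, and the projection matrices are random placeholders, whereas the paper's projections would be obtained from training, which the abstract does not detail. The sketch only shows the shape bookkeeping: each pass expands the features to second order and projects back to the original dimension, so after k passes the features are polynomials of order up to 2^k in the input while the dimension stays fixed.

```python
import numpy as np


def second_order_expand(x):
    """Append all products x_i * x_j (i <= j) to each row of x."""
    d = x.shape[1]
    i, j = np.triu_indices(d)  # index pairs of the upper triangle, i <= j
    return np.concatenate([x, x[:, i] * x[:, j]], axis=1)


def iterate_polynomial(x, projections):
    """Alternate second-order expansion and linear projection.

    Each matrix in `projections` maps the expanded dimension back to
    the original one, so the feature dimension never grows.
    """
    for w in projections:
        x = second_order_expand(x) @ w
    return x


rng = np.random.default_rng(0)
d = 4                                  # toy feature dimension (assumption)
expanded_dim = d + d * (d + 1) // 2    # linear terms plus unique products
# Placeholder projections: random here, trained in an actual system.
projections = [rng.standard_normal((expanded_dim, d)) for _ in range(3)]
features = rng.standard_normal((5, d))  # 5 toy feature vectors
out = iterate_polynomial(features, projections)
```

With `d = 4`, each expansion yields 14 dimensions (4 linear terms plus 10 unique products) before the projection brings it back to 4. Three recursions therefore cover polynomial terms up to order 8 at the cost of three 14-dimensional expansions, rather than a single explicit eighth-order expansion.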

Bibliographic reference. Tahir, M. / Huang, H. / Schlüter, Ralf / Ney, Hermann / Bosch, Louis ten / Cranen, Bert / Boves, Lou (2013): "Training log-linear acoustic models in higher-order polynomial feature space for speech recognition", in INTERSPEECH-2013, 3352-3355.