The use of higher-order polynomial acoustic features can improve the performance of automatic speech recognition (ASR). However, the dimensionality of the polynomial representation can be prohibitively large, which makes training acoustic models on polynomial features infeasible for large-vocabulary ASR systems. This paper presents an iterative polynomial training framework for acoustic modeling, which recursively expands the current acoustic features into their second-order polynomial feature space. In each recursion, the dimensionality is reduced by a linear projection, so that increasingly higher-order polynomial information is incorporated while the dimensionality of the acoustic models stays constant. Experimental results obtained for a large-vocabulary continuous speech recognition task show that the proposed method outperforms conventional mixture models.
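The expand-then-project recursion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the function names are hypothetical, the expansion is assumed to keep the original features alongside all pairwise products, and the projection matrix is a random placeholder, whereas the paper's projection would come from training. The sketch only shows how the feature dimension stays fixed while the polynomial order can double with each recursion.

```python
import random
from itertools import combinations_with_replacement


def expand_second_order(x):
    """Return x followed by all products x[i] * x[j] with i <= j.

    Assumption: the linear terms are kept next to the quadratic ones.
    For d inputs this yields d + d * (d + 1) // 2 outputs.
    """
    products = [a * b for a, b in combinations_with_replacement(x, 2)]
    return list(x) + products


def project(matrix, v):
    """Multiply a (rows x len(v)) matrix, given as a list of rows, by v."""
    return [sum(w * t for w, t in zip(row, v)) for row in matrix]


def random_projection(out_dim, in_dim, rng):
    """Hypothetical placeholder: a random matrix stands in for a trained one."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def iterate_features(x, num_iterations, seed=0):
    """Repeat: expand to second order, then project back to len(x) dimensions."""
    rng = random.Random(seed)
    d = len(x)
    expanded_dim = d + d * (d + 1) // 2
    for _ in range(num_iterations):
        expanded = expand_second_order(x)
        matrix = random_projection(d, expanded_dim, rng)
        x = project(matrix, expanded)
    return x


if __name__ == "__main__":
    features = [1.0, 2.0, 3.0]
    print(len(expand_second_order(features)))  # expanded dimension for d = 3
    print(len(iterate_features(features, 3)))  # dimension after 3 recursions
```

Because each recursion squares features that are already polynomials of the input, k recursions can reach terms of order up to 2^k, while the model only ever sees d-dimensional vectors.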
Bibliographic reference. Tahir, M. / Huang, H. / Schlüter, Ralf / Ney, Hermann / Bosch, Louis ten / Cranen, Bert / Boves, Lou (2013): "Training log-linear acoustic models in higher-order polynomial feature space for speech recognition", In INTERSPEECH-2013, 3352-3355.