EUROSPEECH 2003 - INTERSPEECH 2003
The Aurora 2 database may be used as a benchmark for evaluating algorithms under noisy conditions. In particular, the clean training/noisy test mode is aimed at evaluating models that are trained on clean data only, without further adjustment on the noisy data, i.e. under severe mismatch between the training and test conditions. While several researchers have proposed front-end techniques to improve recognition performance over the reference hidden Markov model (HMM) baseline, investigations at the back-end level are still sought. In this respect, the goal is to develop acoustic models that are intrinsically less noise-sensitive. This paper presents the word accuracy yielded by a non-parametric HMM with connectionist estimates of the emission probabilities, i.e. a neural network is applied instead of the usual parametric (Gaussian mixture) probability densities. A regularization technique, relying on a maximum-likelihood parameter grouping algorithm, is explicitly introduced to increase the generalization capability of the model and, in turn, its noise robustness. Results show that a 15.43% relative word error rate reduction w.r.t. the Gaussian-mixture HMM is obtained, averaged over the different noises and SNRs of Aurora 2 test set A.
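For readers unfamiliar with the metric, the following is a minimal sketch of how a relative word error rate (WER) reduction is conventionally computed from word accuracies. It assumes WER = 100 - word accuracy (in percent), and that accuracies are averaged over conditions before the reduction is taken; the abstract does not state which averaging order was used. The function names and the numbers in the usage lines are hypothetical and are not results from the paper.

```python
from statistics import mean
from typing import Sequence


def relative_wer_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Relative WER reduction in percent, given word accuracies in percent.

    Assumes WER = 100 - word accuracy (an assumption, not stated in the abstract).
    """
    baseline_wer = 100.0 - baseline_accuracy
    new_wer = 100.0 - new_accuracy
    if baseline_wer <= 0.0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


def averaged_reduction(baseline_accs: Sequence[float], new_accs: Sequence[float]) -> float:
    """Average accuracies over conditions (noise type x SNR), then compare.

    The averaging order is an assumption made for illustration only.
    """
    return relative_wer_reduction(mean(baseline_accs), mean(new_accs))


# Hypothetical accuracies, for illustration only (not taken from the paper).
single = relative_wer_reduction(60.0, 70.0)
averaged = averaged_reduction([50.0, 70.0], [60.0, 80.0])
```

With these made-up inputs, the baseline WER of 40% drops to 30%, which is a 25% relative reduction even though the absolute accuracy gain is only 10 points.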
Bibliographic reference. Trentin, Edmondo / Matassoni, Marco / Gori, Marco (2003): "Evaluation on the Aurora 2 database of acoustic models that are less noise-sensitive", In EUROSPEECH-2003, 1805-1808.