Speech Recognition and Intrinsic Variation (SRIV2006)
A technique is proposed for the adaptation of automatic speech recognition systems that use hybrid models combining Artificial Neural Networks with Hidden Markov Models.
In this paper we investigate an extension of the classical approach: linear transformations are applied not only to the input features, but also to the outputs of the internal layers. The motivation is that the outputs of an internal layer represent a projection of the input pattern into a space where it should be easier to learn the classification or transformation expected at the output of the network. To reduce the risk that the network focuses only on the new data and loses its generalization capability (catastrophic forgetting), an original solution, Conservative Training, is proposed.
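The idea of transforming internal-layer outputs can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The network sizes, the random weights and adaptation data, the identity initialization of the transform, and the names `hidden`, `output` and `adapt_hidden_transform` are all assumptions made for illustration. The original weights stay frozen and only the linear transform inserted after the hidden layer is trained. Conservative Training is not shown, because the abstract does not describe its mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen network: 4 input features, 5 hidden units, 3 classes.
W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)
W2, b2 = rng.normal(size=(3, 5)), rng.normal(size=3)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def hidden(x, A_in, c_in):
    # Classical approach: linear transform of the input features,
    # followed by the frozen hidden layer.
    return np.tanh((x @ A_in.T + c_in) @ W1.T + b1)

def output(h, A_hid, c_hid):
    # Extension: linear transform of the hidden-layer outputs,
    # followed by the frozen output layer.
    return softmax((h @ A_hid.T + c_hid) @ W2.T + b2)

def cross_entropy(p, labels):
    return -np.mean(np.log(p[np.arange(len(labels)), labels]))

def adapt_hidden_transform(h, labels, steps=500, lr=0.01):
    # Start from the identity so the adapted network initially equals
    # the original one (an assumption of this sketch).
    A_hid, c_hid = np.eye(5), np.zeros(5)
    targets = np.eye(3)[labels]
    for _ in range(steps):
        p = output(h, A_hid, c_hid)
        # Gradient of the mean cross-entropy w.r.t. the transformed
        # hidden outputs; W1, b1, W2, b2 are never updated.
        g = (p - targets) @ W2 / len(h)
        A_hid -= lr * g.T @ h
        c_hid -= lr * g.sum(axis=0)
    return A_hid, c_hid

# Hypothetical adaptation data.
x = rng.normal(size=(20, 4))
labels = rng.integers(0, 3, size=20)

A_in, c_in = np.eye(4), np.zeros(4)   # input transform left at identity here
h = hidden(x, A_in, c_in)

p_original = softmax(h @ W2.T + b2)            # network without any transform
p_identity = output(h, np.eye(5), np.zeros(5))  # identity transform inserted
loss_before = cross_entropy(p_identity, labels)

A_hid, c_hid = adapt_hidden_transform(h, labels)
loss_after = cross_entropy(output(h, A_hid, c_hid), labels)
print(loss_before, loss_after)
```

In this sketch the input transform `A_in`, `c_in` is held at the identity. It could be trained in the same way by propagating the gradient back through the frozen hidden layer.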
We illustrate the problem of catastrophic forgetting using an artificial test-bed, and apply our techniques to a set of adaptation tasks in the domain of Automatic Speech Recognition (ASR) based on Artificial Neural Networks.
We report on the adaptation potential of the different techniques and on the generalization capability of the adapted networks. The results show that the combination of the proposed approaches mitigates the effects of catastrophic forgetting and always outperforms the classical linear transformation in the feature space.
Bibliographic reference. Scanzio, Stefano / Albesano, Dario / Gemello, Roberto / Laface, Pietro / Mana, Franco (2006): "Adapting hybrid ANN/HMM to speech variations", In SRIV-2006, 143-147.