ISCA Archive ICSLP 1998

Efficient computation of MMI neural networks for large vocabulary speech recognition systems

Jörg Rottland, Andre Ludecke, Gerhard Rigoll

This paper describes how to train Maximum Mutual Information Neural Networks (MMINNs) efficiently with a new topology. Large vocabulary speech recognition systems based on a hybrid MMI/connectionist HMM combination have shown good performance on several tasks (RM and WSJ). MMINNs are trained to maximize the mutual information between the index of the winning output neuron (winner-takes-all network) and the phonetic class of the corresponding acoustic frame. One major problem of MMI neural networks is the high computational effort required to train them, which is proportional to the input and output size of the network and to the number of training samples. This paper presents two approaches that demonstrate how these long training times can be reduced with very little or even no loss in recognition accuracy. This is achieved by using phonetic knowledge to build a network topology based on phonetic classes.
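To make the training criterion concrete, the sketch below estimates the empirical mutual information between a sequence of winner-takes-all neuron indices and the corresponding phonetic class labels — the quantity the abstract says MMINNs are trained to maximize. This is a minimal illustration from the textbook definition of mutual information, not code from the paper; the variable names and the use of plug-in count estimates are assumptions.

```python
import math
from collections import Counter

def mutual_information(winners, classes):
    """Plug-in estimate (in nats) of I(Q; C) between winner indices Q
    and phonetic class labels C, computed from empirical counts."""
    n = len(winners)
    joint = Counter(zip(winners, classes))   # joint counts n(q, c)
    count_w = Counter(winners)               # marginal counts n(q)
    count_c = Counter(classes)               # marginal counts n(c)
    mi = 0.0
    for (w, c), n_wc in joint.items():
        p_joint = n_wc / n
        # I(Q;C) = sum p(q,c) * log( p(q,c) / (p(q) p(c)) )
        mi += p_joint * math.log(p_joint / ((count_w[w] / n) * (count_c[c] / n)))
    return mi
```

When the winner index determines the class perfectly, the estimate equals the class entropy (e.g. log 2 for two equiprobable classes); when index and class are independent, it is zero — so maximizing this quantity drives the network toward class-discriminative winner neurons.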


doi: 10.21437/ICSLP.1998-413

Cite as: Rottland, J., Ludecke, A., Rigoll, G. (1998) Efficient computation of MMI neural networks for large vocabulary speech recognition systems. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0331, doi: 10.21437/ICSLP.1998-413

@inproceedings{rottland98_icslp,
  author={Jörg Rottland and Andre Ludecke and Gerhard Rigoll},
  title={{Efficient computation of MMI neural networks for large vocabulary speech recognition systems}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0331},
  doi={10.21437/ICSLP.1998-413}
}