ISCA Archive Eurospeech 1999

The AT&T large vocabulary conversational speech recognition system

Andrej Ljolje, Michael D. Riley, Donald M. Hindle

We describe the AT&T recognition system used in the DARPA Large Vocabulary Conversational Speech Recognition (LVCSR-98) evaluation. It is based on multi-pass rescoring of weighted Finite State Machines (FSMs) using progressively more accurate acoustic models. The acoustic models used in the system are all gender independent. They are based on three-state context-dependent hidden Markov models using Gaussian mixtures. The recognition paradigm uses the baseline system to generate a set of word lattices. Subsequent passes use Vocal Tract Normalization (VTN), Maximum Likelihood Linear Regression (MLLR) adaptation and ROVER to further refine the recognition output. All the acoustic models (except for one of the additional models used in the ROVER experiments) employed models of alternative pronunciations to improve recognition performance. The overall recognition word error rate on the LVCSR-98 evaluation set was 44.1%.
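For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that conventional definition, not the scoring tool used in the evaluation; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is plain whitespace splitting, with no
    text normalization of the kind official scoring tools may apply.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, a figure of 44.1% corresponds to roughly 44 word-level errors per 100 reference words, and the value can exceed 100% when the hypothesis contains many insertions.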


doi: 10.21437/Eurospeech.1999-196

Cite as: Ljolje, A., Riley, M.D., Hindle, D.M. (1999) The AT&T large vocabulary conversational speech recognition system. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 807-810, doi: 10.21437/Eurospeech.1999-196

@inproceedings{ljolje99_eurospeech,
  author={Andrej Ljolje and Michael D. Riley and Donald M. Hindle},
  title={{The AT&T large vocabulary conversational speech recognition system}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={807--810},
  doi={10.21437/Eurospeech.1999-196}
}