ISCA Archive Eurospeech 2001

MAP combination of multi-stream HMM or HMM/ANN experts

Andrew Morris, Astrid Hagen, Hervé Bourlard

Automatic speech recognition (ASR) performance falls dramatically as the mismatch between training and test data increases. The human ability to recognise speech when a large proportion of frequencies is dominated by noise has inspired the "missing data" and "multi-band" approaches to noise-robust ASR. "Missing data" ASR identifies low-SNR spectral data in each data frame and then ignores it. Multi-band ASR trains a separate model for each position of missing data, estimates a reliability weight for each model, then combines the model outputs in a weighted sum. A problem with both approaches is that local data reliability estimation is inherently inaccurate and also assumes that all of the training data was clean. In this article we present a model in which adaptive multi-band expert weighting is incorporated naturally into the maximum a posteriori (MAP) decoding process.
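
The weighted-sum combination used in conventional multi-band ASR, as described above, can be sketched as follows. This is a minimal illustration only, not the MAP model proposed in the paper: the function name `combine_experts`, the array shapes, and the example numbers are all assumptions made for the sake of the sketch, and the reliability weights are simply taken as given.

```python
import numpy as np


def combine_experts(posteriors, weights):
    """Combine per-expert class posteriors for one frame in a weighted sum.

    posteriors: shape (n_experts, n_classes); each row is one sub-band
                expert's posterior distribution over classes (hypothetical).
    weights:    shape (n_experts,); non-negative reliability weights,
                normalised here so that they sum to 1.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or weights.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    weights = weights / weights.sum()
    # Weighted sum over experts: sum_k w_k * P_k(class | frame).
    return weights @ posteriors


# Illustrative numbers only: two experts, three classes.
frame_posteriors = [[0.7, 0.2, 0.1],
                    [0.2, 0.5, 0.3]]
reliability = [0.75, 0.25]
combined = combine_experts(frame_posteriors, reliability)
print(combined)
```

Because each row is a distribution and the weights are normalised, the combined vector is itself a distribution over classes. How the weights are estimated is the hard part that the abstract identifies, and it is left outside this sketch.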


doi: 10.21437/Eurospeech.2001-79

Cite as: Morris, A., Hagen, A., Bourlard, H. (2001) MAP combination of multi-stream HMM or HMM/ANN experts. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 225-228, doi: 10.21437/Eurospeech.2001-79

@inproceedings{morris01_eurospeech,
  author={Andrew Morris and Astrid Hagen and Hervé Bourlard},
  title={{MAP combination of multi-stream HMM or HMM/ANN experts}},
  year=2001,
  booktitle={Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages={225--228},
  doi={10.21437/Eurospeech.2001-79}
}