ISCA Archive Interspeech 2006

Multi-stream ASR: an oracle perspective

Hemant Misra, Jithendra Vepa, Hervé Bourlard

Multi-stream automatic speech recognition (ASR) systems are usually shown to outperform single-stream systems, especially in noisy test conditions. Indeed, there is a trend in ASR today towards using more and more acoustic features, combined either at the input (early integration, possibly preceded by some linear or nonlinear transformation) or later in the recognition process (e.g., at the level of likelihoods, referred to as late integration). However, to exploit such multi-stream systems optimally, we need features that are as complementary as possible, together with the best combination method for those streams. In practice, it is never clear whether the potential of the available streams is fully exploited. This paper investigates an ‘oracle’ test to provide some insight into these issues. Although it does not provide an absolute upper bound on performance, the oracle is shown to indicate the complementarity of the feature streams used and to provide a reasonable reference target for evaluating combination strategies. The oracle analysis is supported by results obtained on the Numbers95 database using different feature streams and an entropy-based combination method.

doi: 10.21437/Interspeech.2006-634

Cite as: Misra, H., Vepa, J., Bourlard, H. (2006) Multi-stream ASR: an oracle perspective. Proc. Interspeech 2006, paper 1663-Thu2CaP.3, doi: 10.21437/Interspeech.2006-634

  author={Hemant Misra and Jithendra Vepa and Hervé Bourlard},
  title={{Multi-stream ASR: an oracle perspective}},
  booktitle={Proc. Interspeech 2006},
  pages={paper 1663-Thu2CaP.3},