ISCA Archive Interspeech 2007

In-context phone posteriors as complementary features for tandem ASR

Hamed Ketabdar, Hervé Bourlard

In this paper, we present a method for integrating prior knowledge (such as phonetic and lexical knowledge) and acoustic context (e.g., the whole utterance) into phone posterior estimation, and we propose using the resulting posteriors as complementary features in a Tandem ASR configuration. These posteriors are estimated according to the HMM state posterior probability definition (typically used in standard HMM training). By integrating the appropriate prior knowledge and context in this way, we enhance the estimation of phone posteriors. We call these new posteriors ‘in-context’ or HMM posteriors. We combine them, as complementary evidence, with the posteriors estimated by a Multi-Layer Perceptron (MLP), and use the combined evidence as features for training and inference in the Tandem configuration. This approach improves performance over using only MLP-estimated posteriors as Tandem features on the OGI Numbers, Conversational Telephone Speech (CTS), and Wall Street Journal (WSJ) databases.
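The HMM state posterior mentioned above is the quantity usually written as gamma(i, t) = P(state i at time t | whole utterance, model) and computed with the forward-backward recursions. The following is a minimal sketch of that computation and of one possible way to combine the result with MLP posteriors. It is not the authors' implementation: the function names `state_posteriors` and `tandem_features`, the toy three-state transition matrix, the use of MLP posteriors as emission scores, and the combination by concatenating log posteriors are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def state_posteriors(log_lik, log_trans, log_init):
    """Forward-backward state posteriors (gamma), computed in the log domain.

    log_lik:   (T, S) per-frame log emission scores for each state
    log_trans: (S, S) log transition scores, log_trans[i, j] = log P(j | i)
    log_init:  (S,)   log initial state probabilities
    Returns a (T, S) array whose rows sum to 1.
    """
    T, S = log_lik.shape
    log_alpha = np.empty((T, S))
    log_beta = np.zeros((T, S))  # log(1) at the final frame

    # Forward pass: evidence from the start of the utterance up to frame t.
    log_alpha[0] = log_init + log_lik[0]
    for t in range(1, T):
        prev = log_alpha[t - 1][:, None] + log_trans  # (from state, to state)
        log_alpha[t] = log_lik[t] + np.logaddexp.reduce(prev, axis=0)

    # Backward pass: evidence from frame t+1 to the end of the utterance.
    for t in range(T - 2, -1, -1):
        nxt = log_trans + (log_lik[t + 1] + log_beta[t + 1])[None, :]
        log_beta[t] = np.logaddexp.reduce(nxt, axis=1)

    # Combine both directions and normalise each frame.
    log_gamma = log_alpha + log_beta
    log_gamma -= np.logaddexp.reduce(log_gamma, axis=1, keepdims=True)
    return np.exp(log_gamma)


def tandem_features(mlp_post, hmm_post, eps=1e-10):
    """Assumed combination rule: concatenate the log posteriors frame by frame."""
    return np.concatenate(
        [np.log(mlp_post + eps), np.log(hmm_post + eps)], axis=1
    )


# Toy example with random stand-ins for MLP phone posteriors (hypothetical data).
rng = np.random.default_rng(0)
T, S = 6, 3
mlp_post = rng.dirichlet(np.ones(S), size=T)  # (T, S), rows sum to 1
log_trans = np.log(np.array([[0.8, 0.1, 0.1],
                             [0.1, 0.8, 0.1],
                             [0.1, 0.1, 0.8]]))
log_init = np.log(np.full(S, 1.0 / S))

# Assumption: MLP posteriors are used directly as emission scores.
gamma = state_posteriors(np.log(mlp_post), log_trans, log_init)
feats = tandem_features(mlp_post, gamma)  # shape (T, 2 * S)
```

In this sketch the prior knowledge enters only through `log_trans` and `log_init`; a topology encoding phonetic or lexical constraints would take the place of the toy transition matrix. A Tandem system would typically also decorrelate such features before HMM training, which is omitted here.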

doi: 10.21437/Interspeech.2007-560

Cite as: Ketabdar, H., Bourlard, H. (2007) In-context phone posteriors as complementary features for tandem ASR. Proc. Interspeech 2007, 2069-2072, doi: 10.21437/Interspeech.2007-560

  author={Hamed Ketabdar and Hervé Bourlard},
  title={{In-context phone posteriors as complementary features for tandem ASR}},
  booktitle={Proc. Interspeech 2007},