ISCA Archive Interspeech 2008

Multi-stream spectro-temporal features for robust speech recognition

Sherry Y. Zhao, Nelson Morgan

This study investigates a multi-stream approach to utilizing the inherently large number of spectro-temporal features for speech recognition. Instead of reducing the feature-space dimension, the method divides the features into streams so that each stream represents a patch of information in the spectro-temporal response field. When used in combination with MFCCs for speech recognition under both clean and noisy conditions, multi-stream spectro-temporal features provide roughly a 30% relative improvement in word-error rate over using MFCCs alone. The result suggests that the multi-stream approach may be an effective way to handle and utilize spectro-temporal features for speech applications.
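To make the two quantitative ideas in the abstract concrete, here is a minimal sketch and not the authors' implementation. It assumes streams are formed by cutting one frame's feature vector into contiguous, near-equal groups, whereas the paper groups features by patch of the spectro-temporal response field. The function names `split_into_streams` and `relative_wer_improvement` and all numbers are hypothetical and chosen only for illustration.

```python
def split_into_streams(feature_vector, num_streams):
    """Partition one frame's features into contiguous, near-equal streams.

    Assumption: contiguous grouping stands in for the paper's grouping by
    spectro-temporal patch, which is not specified in the abstract.
    """
    if not 1 <= num_streams <= len(feature_vector):
        raise ValueError("num_streams must be between 1 and the feature count")
    base, extra = divmod(len(feature_vector), num_streams)
    streams, start = [], 0
    for i in range(num_streams):
        # The first `extra` streams each take one leftover feature.
        size = base + (1 if i < extra else 0)
        streams.append(feature_vector[start:start + size])
        start += size
    return streams


def relative_wer_improvement(baseline_wer, new_wer):
    """Relative reduction in word-error rate with respect to the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical frame with 12 features, split into 4 streams of 3 features.
frame = list(range(12))
streams = split_into_streams(frame, 4)

# Hypothetical WERs (not results from the paper): 10.0% down to 7.0%
# corresponds to a 30% relative improvement.
improvement = relative_wer_improvement(10.0, 7.0)
```

Note that the dimensionality is preserved: the streams together contain every original feature, which is the contrast the abstract draws with dimension reduction.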


doi: 10.21437/Interspeech.2008-209

Cite as: Zhao, S.Y., Morgan, N. (2008) Multi-stream spectro-temporal features for robust speech recognition. Proc. Interspeech 2008, 898-901, doi: 10.21437/Interspeech.2008-209

@inproceedings{zhao08_interspeech,
  author={Sherry Y. Zhao and Nelson Morgan},
  title={{Multi-stream spectro-temporal features for robust speech recognition}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={898--901},
  doi={10.21437/Interspeech.2008-209}
}