ISCA Archive Interspeech 2008

High-performance low-latency speech recognition via multi-layered feature streaming and fast Gaussian computation

Liang Gu, Jian Xue, Xiaodong Cui, Yuqing Gao

Highly accurate speech recognition with very low latency is a major challenge, but also an important requirement, for modern real-time speech recognition applications such as speech-to-speech translation. We address this problem by proposing an effective and efficient streaming-mode decoding scheme. A novel multi-layered feature streaming method is introduced to minimize truncation errors during streaming by optimizing look-ahead parameters. A set of algorithms is further proposed to accelerate both Gaussian computation and graph search. Experiments show a dramatic reduction in decoding latency with the proposed scheme, while recognition accuracy remains similar to that of utterance-based decoding.


doi: 10.21437/Interspeech.2008-544

Cite as: Gu, L., Xue, J., Cui, X., Gao, Y. (2008) High-performance low-latency speech recognition via multi-layered feature streaming and fast Gaussian computation. Proc. Interspeech 2008, 2098-2101, doi: 10.21437/Interspeech.2008-544

@inproceedings{gu08_interspeech,
  author={Liang Gu and Jian Xue and Xiaodong Cui and Yuqing Gao},
  title={{High-performance low-latency speech recognition via multi-layered feature streaming and fast Gaussian computation}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={2098--2101},
  doi={10.21437/Interspeech.2008-544}
}