Highly accurate speech recognition with very low latency is a major challenge and an important requirement for modern real-time speech recognition applications such as speech-to-speech translation. We address this problem by proposing an effective and efficient streaming-mode decoding scheme. A novel multi-layered feature streaming method is introduced to minimize truncation errors during streaming by optimizing look-ahead parameters. A set of speed-up algorithms is further proposed to accelerate both Gaussian computation and graph search. Experiments show a dramatic reduction in decoding latency with the proposed decoding scheme, while recognition accuracy remains similar to that of utterance-based decoding.
Bibliographic reference. Gu, Liang / Xue, Jian / Cui, Xiaodong / Gao, Yuqing (2008): "High-performance low-latency speech recognition via multi-layered feature streaming and fast Gaussian computation", In INTERSPEECH-2008, 2098-2101.