EUROSPEECH 2003 - INTERSPEECH 2003
This paper presents a real-time decoder for low-latency online speech transcription. The system was developed within the Synface project, which aims to make conventional telephony more usable for hard-of-hearing people by providing speech-synchronized multimodal feedback. The paper addresses the specific issues of HMM-based incremental phone classification under real-time constraints. The decoding algorithm described here allows a trade-off between recognition accuracy and latency: by accepting a longer latency per output increment, more time can be allotted to hypothesis look-ahead, which improves classification accuracy. Experiments on the Swedish SpeechDat database show that, with a latency of approximately 150 ms or more, the decoder can produce the same classification as non-incremental decoding using HTK.
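The latency/accuracy trade-off can be illustrated with a generic fixed-lag Viterbi decoder. The sketch below is not the decoder described in the paper; it is a minimal toy example, and the function names (`viterbi_step`, `backtrace`, `fixed_lag_decode`), the two-state model, and all probabilities are assumptions made for illustration. Each frame's label is committed only after `lag` further frames have been observed, so a larger `lag` gives the decoder more look-ahead.

```python
import math

def viterbi_step(prev_scores, log_trans, log_emit):
    """Advance the Viterbi recursion by one frame (log domain)."""
    n = len(log_emit)
    scores, backptr = [], []
    for j in range(n):
        # Best predecessor state for state j.
        best_i = max(range(n), key=lambda i: prev_scores[i] + log_trans[i][j])
        scores.append(prev_scores[best_i] + log_trans[best_i][j] + log_emit[j])
        backptr.append(best_i)
    return scores, backptr

def backtrace(backptrs, state, steps):
    """Follow backpointers `steps` frames back from `state` at the newest frame."""
    for k in range(steps):
        state = backptrs[len(backptrs) - 1 - k][state]
    return state

def fixed_lag_decode(log_init, log_trans, frames, lag):
    """Label every frame, committing each label `lag` frames after it was seen."""
    if not frames:
        return []
    output, backptrs, scores = [], [], None
    for t, log_emit in enumerate(frames):
        if t == 0:
            scores = [log_init[j] + log_emit[j] for j in range(len(log_emit))]
        else:
            scores, bp = viterbi_step(scores, log_trans, log_emit)
            backptrs.append(bp)
        if t >= lag:
            # Commit frame t - lag using the currently best hypothesis.
            best = max(range(len(scores)), key=scores.__getitem__)
            output.append(backtrace(backptrs, best, lag))
    # End of input: flush the frames that are still pending.
    total = len(frames)
    best = max(range(len(scores)), key=scores.__getitem__)
    for t in range(max(total - lag, 0), total):
        output.append(backtrace(backptrs, best, total - 1 - t))
    return output

def logs(probs):
    return [math.log(p) for p in probs]

# Hypothetical two-state model; the middle frame is misleading.
LOG_INIT = logs([0.5, 0.5])
LOG_TRANS = [logs([0.8, 0.2]), logs([0.2, 0.8])]
FRAMES = [logs([0.9, 0.1]), logs([0.1, 0.9]), logs([0.9, 0.1])]

no_lookahead = fixed_lag_decode(LOG_INIT, LOG_TRANS, FRAMES, lag=0)   # [0, 1, 0]
one_frame = fixed_lag_decode(LOG_INIT, LOG_TRANS, FRAMES, lag=1)      # [0, 0, 0]
offline = fixed_lag_decode(LOG_INIT, LOG_TRANS, FRAMES, lag=len(FRAMES))  # [0, 0, 0]
```

In this toy case, decoding with no look-ahead follows the misleading middle frame, while a lag of a single frame already reproduces the offline (full-utterance) result. This mirrors, in miniature, the reported finding that a sufficiently long latency yields the same classification as non-incremental decoding.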
Bibliographic reference. Seward, Alexander (2003): "Low-latency incremental speech transcription in the synface project", In EUROSPEECH-2003, 1141-1144.