ISCA Archive Interspeech 2009

Online detecting end times of spoken utterances for synchronization of live speech and its transcripts

Jie Gao, Qingwei Zhao, Yonghong Yan

In this paper, we present our initial efforts on the task of Automatically Synchronizing live spoken Utterances with their Transcripts (textual contents) (ASUT). We address the problem of detecting online the end time of a spoken utterance given its textual content, which is one of the key problems of the ASUT task. A frame-synchronous likelihood ratio test (FS-LRT) procedure is proposed and explored under the hidden Markov model (HMM) framework. The properties of FS-LRT are studied empirically. Experiments indicate that the proposed approach achieves satisfactory performance. In addition, the proposed procedure has been successfully applied in a subtitling system for live broadcast news.


doi: 10.21437/Interspeech.2009-605

Cite as: Gao, J., Zhao, Q., Yan, Y. (2009) Online detecting end times of spoken utterances for synchronization of live speech and its transcripts. Proc. Interspeech 2009, 2115-2118, doi: 10.21437/Interspeech.2009-605

@inproceedings{gao09_interspeech,
  author={Jie Gao and Qingwei Zhao and Yonghong Yan},
  title={{Online detecting end times of spoken utterances for synchronization of live speech and its transcripts}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={2115--2118},
  doi={10.21437/Interspeech.2009-605}
}