7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Speech Recognition with a Re-Speak Method for Subtitling Live Broadcasts

Toru Imai, Atsushi Matsui, Shinichi Homma, Takeshi Kobayakawa, Kazuo Onoe, Shoei Sato, Akio Ando

NHK, Japan

This paper describes a "re-speak" method for subtitling live TV broadcasts using a speech recognition system. Original on-location speech in live sports or music programs contains background noise, spontaneous or emotional speech, and the voices of speakers unknown to the recognition system, all of which degrade recognition performance. However, if a different speaker, to whom the system has been adapted, carefully rephrases the original utterances in a studio, these problems can be largely overcome. Recognition experiments showed that rephrasing the commentary reduced perplexities and word error rates compared with simply repeating it. Speech recognition using the re-speak method was applied in practice to a music-based variety show and the 2002 Winter Olympic Games to automatically produce simultaneous subtitles for hearing-impaired viewers. The system achieved a word error rate below 5% and a subtitle display delay below three seconds.
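
For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The following is a minimal illustrative sketch of that standard definition, not the evaluation code used in the paper. The function name `word_error_rate` is hypothetical, and the sketch assumes the text is already split into words by whitespace, whereas Japanese text would first need a word segmenter.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: both strings are already tokenized with spaces between words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a word error rate below 5% corresponds to fewer than one word-level error per twenty reference words.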

Bibliographic reference.  Imai, Toru / Matsui, Atsushi / Homma, Shinichi / Kobayakawa, Takeshi / Onoe, Kazuo / Sato, Shoei / Ando, Akio (2002): "Speech recognition with a re-speak method for subtitling live broadcasts", In ICSLP-2002, 1757-1760.