ISCA Archive ICSLP 1998

On the use of automatic speech recognition for TV captioning

Jordi Robert-Ribes

This study analyses the possible use of automatic speech recognition (ASR) for the automatic captioning of TV programs. Captioning requires (1) transcribing the spoken words and (2) determining the times at which each caption has to appear on and disappear from the screen. These times have to match the corresponding times in the audio signal as closely as possible. Automatic speech recognition can be used to determine both aspects: the spoken words and their times. This paper focuses on the question: would perfect automatic speech recognition systems be able to automate the captioning process? We present quantitative data on the discrepancy between the audio signal and the manually generated captions. We show how ASR alone can even lower the efficiency of captioning. The techniques needed to automate the captioning process are presented.


doi: 10.21437/ICSLP.1998-700

Cite as: Robert-Ribes, J. (1998) On the use of automatic speech recognition for TV captioning. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0621, doi: 10.21437/ICSLP.1998-700

@inproceedings{robertribes98_icslp,
  author={Jordi Robert-Ribes},
  title={{On the use of automatic speech recognition for TV captioning}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0621},
  doi={10.21437/ICSLP.1998-700}
}