This paper studies the generation of intralingual closed captions from automatic speech transcripts, with the aim of assessing techniques for multi-genre captioning. Captions and subtitles vary greatly in form and content depending on program genre and subtitling style, resulting, for instance, in significantly different compression rates and lexical content. Borrowing ideas from the multi-domain machine translation literature, we implement and contrast several adaptation methods on a diverse set of programs broadcast on French public television. Our results show that such multi-domain adaptation techniques are effective and help improve our automatic subtitling system.
Cite as: Buet, F., Yvon, F. (2021) Toward Genre Adapted Closed Captioning. Proc. Interspeech 2021, 4403-4407, doi: 10.21437/Interspeech.2021-1762
@inproceedings{buet21_interspeech,
  author={François Buet and François Yvon},
  title={{Toward Genre Adapted Closed Captioning}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={4403--4407},
  doi={10.21437/Interspeech.2021-1762}
}