ISCA Archive Interspeech 2013

A lecture transcription system combining neural network acoustic and language models

Peter Bell, Hitoshi Yamamoto, Pawel Swietojanski, Youzheng Wu, Fergus McInnes, Chiori Hori, Steve Renals

This paper presents a new system for automatic transcription of lectures. The system combines a number of novel features, including deep neural network acoustic models using multi-level adaptive networks to incorporate out-of-domain information, and factored recurrent neural network language models. We demonstrate that the system achieves large improvements on the TED lecture transcription task from the 2012 IWSLT evaluation. Our results are currently the best reported on this task, showing a relative WER reduction of more than 16% compared to the closest competing system from the evaluation.


doi: 10.21437/Interspeech.2013-673

Cite as: Bell, P., Yamamoto, H., Swietojanski, P., Wu, Y., McInnes, F., Hori, C., Renals, S. (2013) A lecture transcription system combining neural network acoustic and language models. Proc. Interspeech 2013, 3087-3091, doi: 10.21437/Interspeech.2013-673

@inproceedings{bell13_interspeech,
  author={Peter Bell and Hitoshi Yamamoto and Pawel Swietojanski and Youzheng Wu and Fergus McInnes and Chiori Hori and Steve Renals},
  title={{A lecture transcription system combining neural network acoustic and language models}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={3087--3091},
  doi={10.21437/Interspeech.2013-673}
}