11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Semi-Automated Update of Automatic Transcription System for the Japanese National Congress

Yuya Akita, Masato Mimura, Graham Neubig, Tatsuya Kawahara

Kyoto University, Japan

Updating acoustic and language models is vital for maintaining the performance of automatic speech recognition (ASR) systems. To reduce the effort required for these updates, we propose a "semi-automated" framework for the ASR system of the Japanese National Congress. The framework combines our speaking-style transformation (SST) and lightly-supervised training (LSV) approaches, which automatically generate spoken-style training texts and labels from documents such as meeting minutes. An experimental evaluation demonstrated that this update framework improved ASR performance on the latest meeting data. We also present an SST-based method for estimating ASR accuracy that uses the minutes as reference texts and does not require verbatim transcripts.

Bibliographic reference. Akita, Yuya / Mimura, Masato / Neubig, Graham / Kawahara, Tatsuya (2010): "Semi-automated update of automatic transcription system for the Japanese national congress", in INTERSPEECH-2010, 338-341.