8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Automatic Transcription for a Web 2.0 Service to Search Podcasts

Jun Ogata, Masataka Goto, Kouichirou Eto

AIST, Japan

This paper describes the speech recognition techniques behind "PodCastle", a Web 2.0 service in which users can search and read transcribed podcast texts and correct the recognition errors in them. Most previous speech recognizers had difficulty transcribing podcasts because podcasts contain many kinds of content recorded under different conditions, and because they cover recent topics that tend to involve many out-of-vocabulary words. To overcome these difficulties, we continuously improve the speech recognizers by using information aggregated through Web 2.0 services. For example, the language model is adapted on the fly to the topic of the target podcast, the pronunciations of out-of-vocabulary words are obtained from a Web 2.0 service, and the acoustic model is trained on the results of error correction by anonymous users. The experiments reported in this paper show that our techniques produce promising results for podcasts.

Bibliographic reference. Ogata, Jun / Goto, Masataka / Eto, Kouichirou (2007): "Automatic transcription for a Web 2.0 service to search podcasts", in INTERSPEECH-2007, 2617-2620.