EUROSPEECH 2003 - INTERSPEECH 2003
Many important speech recognition tasks, such as broadcast news transcription and spoken document retrieval, feature an open, constantly changing vocabulary. Recognizing a word, and a new word in particular, requires that its acoustic baseform be known. Commonly, words are transcribed manually, which places a major burden on vocabulary adaptation and inter-domain portability. In this work we investigate the possibility of applying a data-driven grapheme-to-phoneme converter to obtain the necessary phonetic transcriptions. Experiments were carried out on English and German speech recognition tasks. We study the relation between transcription quality and word error rate and show that this method significantly reduces manual transcription effort with an acceptable loss in performance.
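Both quantities compared in the abstract, transcription quality and word error rate, are commonly measured as an edit distance normalized by the reference length. The following is a minimal sketch of that standard metric, not the evaluation code used in the paper. The function names and the example phoneme symbols are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance divided by the number of reference tokens."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: a manual transcription of "night" compared with
# an automatically generated one (phoneme symbols are illustrative only).
manual = ["n", "ay", "t"]
automatic = ["n", "ih", "g", "t"]
phoneme_error_rate = error_rate(manual, automatic)
```

Applied to phoneme sequences, the same function gives a phoneme error rate for the converter's output. Applied to the word sequences of a reference transcript and a recognizer hypothesis, it gives the word error rate.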
Bibliographic reference. Bisani, M. / Ney, Hermann (2003): "Multigram-based grapheme-to-phoneme conversion for LVCSR", In EUROSPEECH-2003, 933-936.