EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Multigram-Based Grapheme-to-Phoneme Conversion for LVCSR

M. Bisani, Hermann Ney

RWTH Aachen, Germany

Many important speech recognition tasks, such as broadcast news transcription and spoken document retrieval, feature an open, constantly changing vocabulary. Recognizing a word, and a new word in particular, requires that its acoustic baseform be known. Words are commonly transcribed manually, which places a major burden on vocabulary adaptation and inter-domain portability. In this work we investigate the possibility of applying a data-driven grapheme-to-phoneme converter to obtain the necessary phonetic transcriptions. Experiments were carried out on English and German speech recognition tasks. We study the relation between transcription quality and word error rate and show that this method can significantly reduce manual transcription effort with an acceptable loss in performance.

Bibliographic reference. Bisani, M. / Ney, Hermann (2003): "Multigram-based grapheme-to-phoneme conversion for LVCSR", in EUROSPEECH-2003, 933-936.