Second ESCA/IEEE Workshop on Speech Synthesis

September 12-15, 1994
Mohonk Mountain House, New Paltz, NY, USA

A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion

Walter Daelemans, Antal van den Bosch

University of Tilburg, The Netherlands

We report on an implemented grapheme-to-phoneme conversion architecture. Given a set of examples in a language (spelled words paired with their phonetic representations), the architecture automatically produces a grapheme-to-phoneme conversion system for that language. The resulting system takes the spelling of a word as input and produces its phonetic transcription as output, according to the rules implicit in the training data. This paper describes the architecture and focuses on our solution to the alignment problem: the spelling and the phonetic transcription of a word often differ in length, so the two representations have to be aligned in such a way that grapheme symbols, or strings of grapheme symbols, are consistently associated with the same phonetic symbol. Done by hand, this alignment is extremely labour-intensive.
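
To make the alignment problem concrete, the following is a minimal sketch and not the method of the paper, which the abstract does not specify. It assumes a word never has more phonemes than letters, and that a table of letter-phoneme association scores is already available. The function name `align`, the null symbol `-`, the toy phoneme symbols and the score values are all hypothetical. A dynamic program pads the transcription with null phonemes so that every letter is paired with exactly one phoneme or with the null symbol.

```python
from typing import Dict, List, Tuple

NULL = "-"  # hypothetical placeholder: the letter maps to no phoneme


def align(spelling: str, phonemes: List[str],
          score: Dict[Tuple[str, str], float]) -> List[Tuple[str, str]]:
    """Pair every letter with one phoneme or NULL, maximising the summed score.

    Assumption of this sketch: len(phonemes) <= len(spelling).
    Pairs missing from `score` count as 0.0.
    """
    n, m = len(spelling), len(phonemes)
    if m > n:
        raise ValueError("this sketch assumes no more phonemes than letters")

    neg = float("-inf")
    # best[i][j]: best score for the first i letters and first j phonemes.
    best = [[neg] * (m + 1) for _ in range(n + 1)]
    # took[i][j]: True if letter i consumed phoneme j on the best path.
    took = [[False] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0

    for i in range(1, n + 1):
        letter = spelling[i - 1]
        for j in range(0, min(i, m) + 1):
            # Option 1: the letter is paired with the null phoneme.
            if best[i - 1][j] > neg:
                best[i][j] = best[i - 1][j] + score.get((letter, NULL), 0.0)
                took[i][j] = False
            # Option 2: the letter is paired with the next phoneme.
            if j > 0 and best[i - 1][j - 1] > neg:
                cand = best[i - 1][j - 1] + score.get((letter, phonemes[j - 1]), 0.0)
                if cand > best[i][j]:
                    best[i][j] = cand
                    took[i][j] = True

    # Trace back from the full word to recover the letter-phoneme pairs.
    pairs: List[Tuple[str, str]] = []
    j = m
    for i in range(n, 0, -1):
        if took[i][j]:
            pairs.append((spelling[i - 1], phonemes[j - 1]))
            j -= 1
        else:
            pairs.append((spelling[i - 1], NULL))
    pairs.reverse()
    return pairs


# Toy association scores, invented for illustration only.
toy_scores = {("b", "b"): 1.0, ("a", "a"): 1.0, ("c", "k"): 1.0, ("k", NULL): 1.0}
pairs = align("back", ["b", "a", "k"], toy_scores)
print(pairs)  # [('b', 'b'), ('a', 'a'), ('c', 'k'), ('k', '-')]
```

In this toy run the four letters of "back" are matched to three phonemes by pairing the final "k" with the null symbol. In a data-oriented setting the scores would have to be estimated from the example pairs rather than written by hand; how that is done here is not stated in the abstract.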

Bibliographic reference. Daelemans, Walter / Bosch, Antal van den (1994): "A language-independent, data-oriented architecture for grapheme-to-phoneme conversion", in SSW2-1994, 199-202.