8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


A Latent Analogy Framework for Grapheme-to-Phoneme Conversion

Jerome R. Bellegarda

Apple Computer Inc., USA

Data-driven grapheme-to-phoneme conversion involves either (top-down) inductive learning or (bottom-up) pronunciation by analogy. Because both approaches rely on local context information, they typically require some external linguistic knowledge, e.g., individual grapheme/phoneme correspondences. To avoid such supervision, this paper proposes an alternative solution, dubbed pronunciation by latent analogy, which adopts a more global definition of analogous events. For each out-of-vocabulary word, a neighborhood of globally relevant pronunciations is constructed through an appropriate data-driven mapping of its graphemic form. Phoneme transcription then proceeds via locally optimal sequence alignment and maximum likelihood position scoring. The method was successfully applied to the synthesis of proper names of widely diverse origins.
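
To make the neighborhood step concrete, the following is a minimal sketch rather than the paper's method. The abstract does not specify the data-driven mapping, so the sketch substitutes plain cosine similarity over letter-trigram counts for the latent mapping, and it stops before the alignment and position-scoring stages. The function names, the `#` boundary padding, the parameter `k`, and the toy lexicon with its ARPAbet-style pronunciations are all assumptions introduced for illustration.

```python
import math
from collections import Counter


def letter_ngrams(word, n=3):
    """Count the letter n-grams of a word, padded with '#' boundary markers."""
    padded = "#" * (n - 1) + word.lower() + "#" * (n - 1)
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def neighborhood(oov_word, lexicon, k=2):
    """Return up to k (word, pronunciation) pairs whose spelling is closest
    to oov_word. Words sharing no trigram with the query are dropped."""
    query = letter_ngrams(oov_word)
    scored = [(cosine(query, letter_ngrams(w)), w) for w in lexicon]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(w, lexicon[w]) for score, w in scored[:k] if score > 0]


# Hypothetical toy lexicon: entries are illustrative, not data from the paper.
TOY_LEXICON = {
    "bellamy": "B EH L AH M IY",
    "belgrade": "B EH L G R EY D",
    "gardner": "G AA R D N ER",
    "thompson": "T AA M P S AH N",
}

for word, pron in neighborhood("bellegarda", TOY_LEXICON, k=2):
    print(word, "->", pron)
```

In a fuller pipeline, the pronunciations returned for the neighborhood would be the input to the sequence alignment and position scoring described in the abstract; those stages are deliberately left out here.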


Bibliographic reference. Bellegarda, Jerome R. (2003): "A latent analogy framework for grapheme-to-phoneme conversion", in EUROSPEECH-2003, 2029-2032.