8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

A Memory Efficient Grapheme-to-Phoneme Conversion System for Speech Processing

Jun Huang, Lex Olorenshaw, Gustavo Hernandez-Abrego, Lei Duan

Sony, USA

In this paper, a memory-efficient, statistical, data-driven approach is proposed and successfully tested for grapheme-to-phoneme (G2P) conversion. In our system, a fast dynamic programming (DP) based algorithm is formulated to estimate the optimal joint segmentation between training sequences of graphemes and phonemes. A statistical language model is trained to model the contextual information between grapheme and phoneme segments. A fast two-stage decoding algorithm is also proposed to recognize the most likely phoneme sequence given the input test word and the n-gram graphone models. Experimental results show that this system achieves recognition accuracy similar to that of a decision-tree based G2P system while requiring much less memory and processing time.
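
To make the idea of a DP-based joint segmentation concrete, the following is a minimal sketch of a generic Viterbi-style search over grapheme-phoneme units ("graphones"). It is not the authors' algorithm, which the abstract does not specify. The function name `best_joint_segmentation`, the unit size limits `max_g` and `max_p`, the `floor` score for unseen units, and the toy log-probabilities are all assumptions made for illustration.

```python
import math

def best_joint_segmentation(letters, phonemes, unit_logprob,
                            max_g=2, max_p=2, floor=-20.0):
    """Find the highest-scoring split of a word and its pronunciation
    into graphone units (hypothetical helper, illustrative only).

    Each unit covers 1..max_g letters and 0..max_p phonemes.
    unit_logprob maps (letter_string, phoneme_tuple) to a log-probability;
    units missing from the table receive the assumed `floor` score.
    """
    n, m = len(letters), len(phonemes)
    neg_inf = -math.inf
    # score[i][j]: best log-score covering the first i letters and j phonemes
    score = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            if score[i][j] == neg_inf:
                continue  # state not reachable
            for dg in range(1, max_g + 1):
                for dp in range(0, max_p + 1):
                    ni, nj = i + dg, j + dp
                    if ni > n or nj > m:
                        continue
                    unit = (letters[i:ni], tuple(phonemes[j:nj]))
                    s = score[i][j] + unit_logprob.get(unit, floor)
                    if s > score[ni][nj]:
                        score[ni][nj] = s
                        back[ni][nj] = (i, j)

    if score[n][m] == neg_inf:
        return None, neg_inf  # no segmentation fits the size limits

    # Follow back-pointers from the end to recover the unit sequence.
    units, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        units.append((letters[pi:i], tuple(phonemes[pj:j])))
        i, j = pi, pj
    units.reverse()
    return units, score[n][m]

# Toy unit scores (made up for this example, not taken from the paper).
toy_logprob = {
    ("ph", ("f",)): -0.1,
    ("o", ("ow",)): -0.2,
    ("n", ("n",)): -0.1,
    ("e", ()): -0.3,  # silent letter: no phoneme emitted
}
segmentation, total = best_joint_segmentation("phone", ["f", "ow", "n"], toy_logprob)
print(segmentation, total)
```

In a training setting, a search like this would typically alternate with re-estimating the unit scores from the resulting segmentations; that loop is omitted here, and whether the paper uses such a scheme is not stated in the abstract.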

Bibliographic reference. Huang, Jun / Olorenshaw, Lex / Hernandez-Abrego, Gustavo / Duan, Lei (2004): "A memory efficient grapheme-to-phoneme conversion system for speech processing", in INTERSPEECH-2004, 1237-1240.