7th International Conference on Spoken Language Processing
September 16-20, 2002
We present a fully data-driven, language-independent way of building a grapheme-to-phoneme converter. We apply the joint-multigram approach to the alignment problem and use standard language modelling techniques to model transcription probabilities. We study model parameters, training procedures, and the effects of corpus size in detail. Experiments were conducted on English and German pronunciation lexica. Our proposed training scheme performs better than previously published ones. Phoneme error rates as low as 3.98% for English and 0.51% for German were achieved.
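As an illustration of the evaluation metric quoted above, the following minimal sketch computes a phoneme error rate under a common definition: the total Levenshtein (edit) distance between reference and predicted phoneme sequences, divided by the total number of reference phonemes. This definition is an assumption here, since the abstract does not spell out the exact scoring procedure. The function names and the toy phoneme symbols are hypothetical and are not taken from the paper or its lexica.

```python
from typing import Iterable, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    # prev[j] holds the distance between the first i-1 reference
    # phonemes and the first j hypothesis phonemes.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference phonemes
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phoneme_error_rate(
    pairs: Iterable[Tuple[Sequence[str], Sequence[str]]]
) -> float:
    """Total edit distance divided by total reference length.

    Assumed definition; the paper's exact scoring may differ.
    """
    pairs = list(pairs)
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total = sum(len(ref) for ref, _ in pairs)
    if total == 0:
        raise ValueError("no reference phonemes to score against")
    return errors / total


# Toy (reference, hypothesis) pairs with made-up phoneme symbols.
toy_pairs = [
    (["k", "ae", "t"], ["k", "ae", "t"]),
    (["d", "ao", "g"], ["d", "aa", "g", "z"]),
]
toy_per = phoneme_error_rate(toy_pairs)
```

In this toy example the second pair contains one substitution and one insertion, so two errors are counted against six reference phonemes. Normalising by the reference length rather than by the number of words is what distinguishes a phoneme error rate from a word error rate.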
Bibliographic reference. Bisani, M. / Ney, Hermann (2002): "Investigations on joint-multigram models for grapheme-to-phoneme conversion", In ICSLP-2002, 105-108.