Machine learning methods for grapheme-to-phoneme (G2P) conversion are popular, but the features used in the literature are most often simply a window of context letters, despite the availability of other features. In this paper, a set of features beyond the seven-letter window, termed non-standard features, is systematically evaluated for American English using decision trees. The results show that adding non-standard features to a seven-letter window gives clear improvements for English, with the most important features being the three previously predicted phone sequences, an initial prediction of lexical stress location, and a window of vowel letters around the current letter.
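As an illustration of the kinds of features named above, the following is a minimal sketch, not the authors' implementation. It builds, for one letter position, a seven-letter window, a window of nearby vowel letters, and a history of the three previously predicted phone sequences. The padding symbol, the vowel inventory, the vowel-window size, the feature names, and all function names are assumptions made for this example; the initial lexical stress prediction is omitted because the abstract does not say how it is obtained.

```python
from typing import Dict, List

PAD = "_"               # hypothetical padding symbol for positions outside the word
VOWELS = set("aeiouy")  # assumed vowel-letter inventory; the paper may define it differently


def letter_window(word: str, i: int, size: int = 7) -> List[str]:
    """Return the `size` letters centred on position i, padded at word edges."""
    half = size // 2
    return [word[j] if 0 <= j < len(word) else PAD
            for j in range(i - half, i + half + 1)]


def vowel_window(word: str, i: int, n: int = 2) -> List[str]:
    """Return the n nearest vowel letters on each side of position i (assumed n)."""
    left = [c for c in word[:i] if c in VOWELS][-n:]
    right = [c for c in word[i + 1:] if c in VOWELS][:n]
    return [PAD] * (n - len(left)) + left + right + [PAD] * (n - len(right))


def features(word: str, i: int, prev_phones: List[str]) -> Dict[str, str]:
    """Combine the standard letter window with two non-standard feature groups."""
    feats: Dict[str, str] = {}
    # Standard features: letters at offsets -3 .. +3 around the current letter.
    for k, c in enumerate(letter_window(word, i)):
        feats[f"L{k - 3:+d}"] = c
    # Non-standard: vowel letters around the current letter.
    for k, c in enumerate(vowel_window(word, i)):
        feats[f"V{k}"] = c
    # Non-standard: the three most recent predictions, padded at the word start.
    history = ([PAD] * 3 + prev_phones)[-3:]
    for k, p in enumerate(history):
        feats[f"P-{3 - k}"] = p
    return feats
```

For example, `features("phone", 2, ["f"])` describes the letter `o` given that one phone sequence has already been predicted. A dictionary like this could then be passed as categorical input to a decision-tree learner, which is the classifier family the abstract reports using.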
Bibliographic reference. Webster, Gabriel / Braunschweiler, Norbert (2008): "An evaluation of non-standard features for grapheme-to-phoneme conversion", In INTERSPEECH-2008, 1845-1848.