Sixth European Conference on Speech Communication and Technology
Word pronunciation can be learned by inductive machine learning algorithms when it is represented as a classification task: classify a letter within its local word context as mapping to its pronunciation. On the basis of generalization accuracy results from empirical studies, we argue that word pronunciation, particularly in complex spelling systems such as that of English, should not be modelled in a way that abstracts from exceptions. Learning methods such as decision tree and backpropagation learning, while trying to abstract from noise, also throw away a large number of useful exceptional cases. Our empirical results suggest that a memory-based approach, which stores all available word-pronunciation knowledge as cases in memory and generalises from this lexicon via analogical reasoning, is at all times the optimal modelling method.
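The task representation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the window size, the padding symbol, the plain overlap similarity, the majority vote among nearest cases, the toy lexicon, and its phoneme symbols (with "-" for a silent letter) are all assumptions made for illustration. Each letter becomes one case, consisting of the letter with its neighbours and the phoneme it maps to; every case is stored, and a new letter is classified by analogy to the most similar stored cases.

```python
from collections import Counter

def windows(word, phonemes, k=3, pad="_"):
    """Yield one (letter-context window, phoneme) case per letter of the word."""
    assert len(word) == len(phonemes), "toy data must be letter-aligned"
    padded = pad * k + word + pad * k
    for i, target in enumerate(phonemes):
        yield tuple(padded[i:i + 2 * k + 1]), target

def overlap(a, b):
    """Number of window positions on which two cases agree."""
    return sum(x == y for x, y in zip(a, b))

class MemoryBasedG2P:
    """Stores every case; nothing is pruned or abstracted away."""

    def __init__(self, k=3):
        self.k = k
        self.cases = []

    def fit(self, lexicon):
        for word, phonemes in lexicon:
            self.cases.extend(windows(word, phonemes, self.k))
        return self

    def predict_letter(self, window):
        # Find the highest overlap with any stored case, then let all
        # cases at that overlap vote on the phoneme.
        best = max(overlap(window, w) for w, _ in self.cases)
        votes = Counter(p for w, p in self.cases if overlap(window, w) == best)
        return votes.most_common(1)[0][0]

    def pronounce(self, word):
        placeholder = ["?"] * len(word)  # targets are unknown at prediction time
        return [self.predict_letter(w)
                for w, _ in windows(word, placeholder, self.k)]

# Hypothetical letter-aligned toy lexicon; "have" is the exceptional case.
LEXICON = [
    ("have", ["h", "a", "v", "-"]),
    ("gave", ["g", "eI", "v", "-"]),
    ("save", ["s", "eI", "v", "-"]),
    ("cave", ["k", "eI", "v", "-"]),
]

model = MemoryBasedG2P().fit(LEXICON)
print(model.pronounce("have"))
print(model.pronounce("wave"))
```

Because the exceptional word "have" stays in memory as its own cases, it is retrieved by exact match, while the vowel of the unseen word "wave" is decided by the majority of its nearest stored neighbours. A learner that pruned "have" as noise would lose the first behaviour.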
Bibliographic reference. Busser, Bertjan / Daelemans, Walter / Bosch, Antal van den (1999): "Machine learning of word pronunciation: the case against abstraction", In EUROSPEECH'99, 2123-2126.