EUROSPEECH 2001 Scandinavia
To improve the performance and usability of speech recognition devices, most applications need to let users enter new words into the system vocabulary or personalize words in it. Voice tagging is a simple example: it uses speaker-dependent spoken samples to generate baseform transcriptions of the spoken words. More sophisticated techniques can use both spoken samples and the text of the new words to generate baseform transcriptions. In this paper, we propose a new approach to the problem: we use Bayesian networks to model the letter-to-sound rule probabilities. Compared to the common decision-tree-based method, this new approach shows a definite advantage.
Bibliographic reference. Ma, Changxue / Randolph, Mark A. (2001): "An approach to automatic phonetic baseform generation based on Bayesian networks", In EUROSPEECH-2001, 1453-1457.