Sixth European Conference on Speech Communication and Technology
In this paper, we describe a method for deriving the phonetic pronunciation of a word from an acoustic utterance of that word alone, without a priori knowledge of its spelling. In earlier work, we used a pronunciation model based on bigram statistics. Bigram statistics constrain only the left-neighbor phone and therefore result in phone sequences that are only pairwise appropriate. Here, we apply a pronunciation model in combination with a phonotactic model, which serves as a language model to constrain the phone sequences produced. Error rates with and without the phonotactic model are presented.
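As an illustration of what a bigram constraint on phone sequences amounts to, the following is a minimal sketch, not the authors' implementation. The toy phone inventory, the training sequences, the add-one smoothing, and the function names train_bigram and log_score are all assumptions made for this example. Each phone is scored given only its left neighbor, so nothing in the model ties a phone to anything further back in the sequence.

```python
import math
from collections import Counter

def train_bigram(sequences, vocab):
    """Estimate add-one-smoothed P(phone | left neighbor) from phone sequences."""
    pair_counts = Counter()
    left_counts = Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the start of a word
        for left, phone in zip(padded, padded[1:]):
            pair_counts[(left, phone)] += 1
            left_counts[left] += 1
    size = len(vocab)

    def prob(left, phone):
        # Add-one smoothing (an assumption) keeps unseen pairs non-zero.
        return (pair_counts[(left, phone)] + 1) / (left_counts[left] + size)

    return prob

def log_score(seq, prob):
    """Sum of log P(phone | left neighbor) over the sequence."""
    padded = ["<s>"] + list(seq)
    return sum(math.log(prob(l, p)) for l, p in zip(padded, padded[1:]))

# Hypothetical toy data: three short phone strings over a four-phone inventory.
training = [["k", "ae", "t"], ["b", "ae", "t"], ["k", "ae", "b"]]
vocab = {"k", "ae", "t", "b"}

prob = train_bigram(training, vocab)
seen = log_score(["k", "ae", "t"], prob)    # every adjacent pair was observed
unseen = log_score(["t", "k", "b"], prob)   # no adjacent pair was observed
print(seen > unseen)
```

Because the score is a sum of pairwise terms, a sequence can score well whenever each adjacent pair is plausible, even if the sequence as a whole is not. This is the limitation the abstract describes as being only pairwise appropriate.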
Bibliographic reference. Ramabhadran, Bhuvana / Deligne, Sabine / Ittycheriah, Abraham (1999): "Acoustics-based baseform generation with pronunciation and/or phonotactic models", In EUROSPEECH'99, 507-510.