Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Acoustics-Based Baseform Generation with Pronunciation and/or Phonotactic Models

Bhuvana Ramabhadran, Sabine Deligne, Abraham Ittycheriah

IBM T. J. Watson Research Center, Yorktown Heights, NY, USA

In this paper, we describe a method for deriving the phonetic pronunciation of a word from an acoustic utterance of that word alone, without a priori knowledge of its spelling. In [5] and [6], we used a pronunciation model based on bigram statistics. Bigram statistics constrain only the left-neighbor phone and therefore yield phone sequences that are only pairwise appropriate. Here, we apply a pronunciation model in combination with a phonotactic model, which acts as a language model to constrain the phone sequences produced. Error rates with and without the phonotactic model are presented.
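As a rough illustration of what "only pairwise appropriate" can mean, the following minimal Python sketch checks a candidate phone sequence against n-gram counts collected from a toy lexicon. Everything in it is an assumption made for illustration: the two toy pronunciations, the ARPAbet-style phone symbols, the helper names (ngrams, train_counts, all_seen), and the use of a trigram check as a stand-in for a longer-context constraint. The abstract does not specify the form of the phonotactic model, so this is not the authors' method.

```python
from collections import Counter

BOS, EOS = "<s>", "</s>"  # hypothetical sequence-boundary markers


def ngrams(phones, n):
    """Return the n-grams of a phone sequence, padded with boundary markers."""
    padded = [BOS] * (n - 1) + list(phones) + [EOS]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


def train_counts(pronunciations, n):
    """Count every n-gram observed in a list of pronunciations."""
    counts = Counter()
    for phones in pronunciations:
        counts.update(ngrams(phones, n))
    return counts


def all_seen(phones, counts, n):
    """True if every n-gram of the candidate was observed in training."""
    return all(gram in counts for gram in ngrams(phones, n))


# Toy lexicon (assumed for illustration): "stop" and "cats".
lexicon = [
    ["S", "T", "AA", "P"],
    ["K", "AE", "T", "S"],
]
bigram_counts = train_counts(lexicon, 2)
trigram_counts = train_counts(lexicon, 3)

# Every adjacent pair in this candidate occurs somewhere in the lexicon,
# yet the sequence as a whole is not a plausible pronunciation.
candidate = ["S", "T", "S"]
pairwise_ok = all_seen(candidate, bigram_counts, 2)
longer_context_ok = all_seen(candidate, trigram_counts, 3)
print(pairwise_ok, longer_context_ok)
```

In this toy setting the candidate passes the bigram check because each pair (S T, T S, and the two boundary pairs) is attested, while the longer-context check rejects it because the triple S T S never occurs. A real system would use smoothed probabilities rather than a seen/unseen test.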


Bibliographic reference.  Ramabhadran, Bhuvana / Deligne, Sabine / Ittycheriah, Abraham (1999): "Acoustics-based baseform generation with pronunciation and/or phonotactic models", In EUROSPEECH'99, 507-510.