EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Learning Linguistically Valid Pronunciations from Acoustic Data

Francoise Beaufays, Ananth Sankar, Shaun Williams, Mitch Weintraub

Nuance Communications, USA

We describe an algorithm to learn word pronunciations from acoustic data. The algorithm jointly optimizes the pronunciation of a word using (a) the acoustic match of this pronunciation to the observed data, and (b) how "linguistically reasonable" the pronunciation is. Variations of word pronunciations in the recognition dictionary, which was created by linguists, are used to train a model that judges whether newly hypothesized pronunciations are reasonable. The algorithm is well suited to learning proper-name pronunciations. Experiments on a corporate name-dialing database show a 40% error rate reduction relative to a letter-to-phone pronunciation engine.
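
As a rough illustration of the joint criterion described in the abstract, the sketch below scores each candidate pronunciation by a weighted sum of an acoustic log-likelihood and a linguistic-plausibility log-probability, then keeps the best candidate. The abstract does not say how the two terms are combined, so the log-linear form, the function name `select_pronunciation`, the weight `lam`, and the toy scores are all assumptions made for illustration only.

```python
from typing import Dict, Tuple


def select_pronunciation(
    candidates: Dict[str, Tuple[float, float]],
    lam: float = 1.0,
) -> str:
    """Return the candidate phone string with the highest combined score.

    `candidates` maps a phone string to (acoustic_logp, linguistic_logp).
    The log-linear combination with weight `lam` is an assumed form,
    not a formula taken from the paper.
    """
    if not candidates:
        raise ValueError("no candidate pronunciations")

    def combined(item: Tuple[str, Tuple[float, float]]) -> float:
        _, (acoustic_logp, linguistic_logp) = item
        return acoustic_logp + lam * linguistic_logp

    best, _ = max(candidates.items(), key=combined)
    return best


# Made-up scores for a hypothetical proper name (not data from the paper).
toy_candidates = {
    "b ow f ey": (-120.0, -2.0),     # good acoustics, plausible
    "b ow f ay z": (-118.0, -9.0),   # best acoustics, less plausible
    "b iy ae f": (-119.0, -15.0),    # implausible phone sequence
}

joint_choice = select_pronunciation(toy_candidates, lam=1.0)
acoustic_only_choice = select_pronunciation(toy_candidates, lam=0.0)
```

With `lam=0.0` the selection depends on the acoustic score alone. A positive `lam` lets the plausibility term override a slightly better acoustic match, which is the trade-off the abstract describes.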


Bibliographic reference. Beaufays, Francoise / Sankar, Ananth / Williams, Shaun / Weintraub, Mitch (2003): "Learning linguistically valid pronunciations from acoustic data", in EUROSPEECH-2003, 2593-2596.