INTERSPEECH 2013
14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Discriminative Training of a Phoneme Confusion Model for a Dynamic Lexicon in ASR

Penny Karanasou, François Yvon, Thomas Lavergne, Lori Lamel

LIMSI, France

To enhance the recognition lexicon, it is important to be able to add pronunciation variants while keeping the confusability introduced by the extra phonemic variation low. However, this confusability is not easily correlated with ASR performance, as it is an inherent phenomenon of speech. This paper proposes a method for constructing a multiple-pronunciation lexicon with high discriminability. To do so, a phoneme confusion model is used to expand the phonemic search space of pronunciation variants during ASR decoding, and a discriminative framework is adopted for training the weights of the phoneme confusions. For parameter estimation, two training algorithms, the perceptron and the CRF model, are implemented using finite-state transducers. Experiments on English data were conducted using a large state-of-the-art continuous speech ASR system.
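
As a rough illustration of the discriminative idea, the following is a minimal sketch of a structured perceptron update over phoneme confusion weights. It is not the authors' implementation: the paper works with finite-state transducers inside a full ASR decoder, whereas this toy version assumes each candidate is already given as a list of (canonical phoneme, realized phoneme) pairs. The function names, the feature definition, and the learning rate `lr` are all assumptions made for the example.

```python
from collections import Counter


def confusion_features(alignment):
    """Count each (canonical, realized) phoneme pair in an alignment.

    Assumption: an alignment is a list of 2-tuples of phoneme symbols.
    """
    return Counter(alignment)


def score(weights, alignment):
    """Linear score: sum of confusion weights over the alignment's pairs."""
    feats = confusion_features(alignment)
    return sum(weights.get(pair, 0.0) * n for pair, n in feats.items())


def perceptron_step(weights, candidates, reference, lr=1.0):
    """One perceptron update (hypothetical helper, modifies weights in place).

    Picks the highest-scoring candidate under the current weights. If it
    differs from the reference, the reference's confusion pairs are
    rewarded and the wrongly chosen candidate's pairs are penalized.
    Returns the candidate that was selected before the update.
    """
    best = max(candidates, key=lambda a: score(weights, a))
    if best != reference:
        for pair, n in confusion_features(reference).items():
            weights[pair] = weights.get(pair, 0.0) + lr * n
        for pair, n in confusion_features(best).items():
            weights[pair] = weights.get(pair, 0.0) - lr * n
    return best
```

Calling `perceptron_step` repeatedly over training utterances shifts weight toward confusions that appear in correct hypotheses and away from those that appear in competing ones, which is the general sense in which such a model is trained discriminatively.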


Bibliographic reference. Karanasou, Penny / Yvon, François / Lavergne, Thomas / Lamel, Lori (2013): "Discriminative training of a phoneme confusion model for a dynamic lexicon in ASR", in INTERSPEECH-2013, 1966-1970.