13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Performance Comparison of Training Algorithms for Semi-Supervised Discriminative Language Modeling

Erinç Dikici (1), Arda Çelebi (2), Murat Saraçlar (1)

(1) Department of Electrical and Electronics Engineering;
(2) Department of Computer Engineering;
Boğaziçi University, Istanbul, Turkey

Discriminative language modeling (DLM) has been shown to improve the accuracy of automatic speech recognition (ASR) systems, but it requires large amounts of both acoustic and text data for training. One way to overcome this requirement is to train on simulated hypotheses instead of real hypotheses, an approach called semi-supervised training. In this study, we compare six different perceptron algorithms under the semi-supervised training approach. We formulate DLM both as a structured prediction problem and as a reranking problem, optimizing different criteria in each. We find that the ranking variants achieve similar or better word error rate (WER) reduction than structured perceptrons when trained on real data, simulated data, or a combination of the two.
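The distinction between the structured prediction and reranking formulations can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper, and it does not reproduce any of the six variants. It assumes that each N-best hypothesis is represented by a sparse feature dictionary (for example n-gram counts) paired with its number of word errors. The function names, the learning rate `lr`, and the toy features are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: textbook-style perceptron updates for N-best lists.
# A hypothesis is a pair (features, word_errors), where features is a dict
# mapping a feature name (e.g. an n-gram) to its count. All names are assumed.

def score(weights, feats):
    """Linear model score: dot product of weights and sparse features."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def add_scaled(weights, feats, scale):
    """In-place update: weights += scale * feats."""
    for f, v in feats.items():
        weights[f] = weights.get(f, 0.0) + scale * v


def structured_update(weights, nbest, lr=1.0):
    """Structured-prediction view: move the weights toward the oracle
    (lowest-error) hypothesis and away from the current top-scoring one."""
    oracle = min(nbest, key=lambda h: h[1])
    predicted = max(nbest, key=lambda h: score(weights, h[0]))
    if predicted is not oracle:
        add_scaled(weights, oracle[0], lr)
        add_scaled(weights, predicted[0], -lr)


def ranking_update(weights, nbest, lr=1.0):
    """Reranking view: for every pair in which the hypothesis with fewer
    errors does not score strictly higher, update toward the better one."""
    for a_feats, a_err in nbest:
        for b_feats, b_err in nbest:
            if a_err < b_err and score(weights, a_feats) <= score(weights, b_feats):
                add_scaled(weights, a_feats, lr)
                add_scaled(weights, b_feats, -lr)


def best(weights, nbest):
    """Return the top-scoring hypothesis under the current weights."""
    return max(nbest, key=lambda h: score(weights, h[0]))


# Toy 3-best list (hypothetical features), listed worst hypothesis first.
toy_nbest = [
    ({"a cap": 1, "cap sad": 1}, 2),
    ({"the cap": 1, "cap sat": 1}, 1),
    ({"the cat": 1, "cat sat": 1}, 0),
]

w_structured = {}
structured_update(w_structured, toy_nbest)

w_ranking = {}
ranking_update(w_ranking, toy_nbest)
```

In this sketch the structured update considers only the oracle and the current best hypothesis, whereas the ranking update uses the relative order of all hypothesis pairs. This is one way in which the two formulations can optimize different criteria.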

Index Terms: discriminative training, semi-supervised learning, language modeling, hypothesis simulation, ranking perceptron

Bibliographic reference. Dikici, Erinç / Çelebi, Arda / Saraçlar, Murat (2012): "Performance comparison of training algorithms for semi-supervised discriminative language modeling", In INTERSPEECH-2012, 206-209.