EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Active and Unsupervised Learning for Automatic Speech Recognition

Giuseppe Riccardi, Dilek Z. Hakkani-Tur

AT&T Labs-Research, USA

State-of-the-art speech recognition systems are trained using human transcriptions of speech utterances. In this paper, we describe a method that combines active and unsupervised learning for automatic speech recognition (ASR). The goal is to minimize the human supervision needed to train acoustic and language models and to maximize recognition performance given the transcribed and untranscribed data. Active learning aims to reduce the number of training examples that must be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function. For unsupervised learning, we exploit the remaining untranscribed data through its ASR output and word confidence scores. Our experiments show that combining active and unsupervised learning reduces the amount of labeled data needed to reach a given word accuracy by 75%.
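
The split between the two kinds of learning can be illustrated with a minimal sketch. The abstract does not specify the cost function or how confidence scores enter training, so everything below is an assumption made for illustration: the `Utterance` type, the use of mean word confidence as the utterance-level score, and the fixed transcription `budget` are hypothetical and are not taken from the paper. The idea shown is only that low-confidence utterances are sent to human transcribers (active learning) while the rest keep their ASR output and word confidences (unsupervised learning).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Utterance:
    """Hypothetical container for one untranscribed utterance."""
    utt_id: str
    asr_words: List[str]            # ASR 1-best hypothesis
    word_confidences: List[float]   # one score in [0, 1] per hypothesized word


def utterance_confidence(utt: Utterance) -> float:
    """Mean word confidence, used here as an assumed utterance-level score."""
    if not utt.word_confidences:
        return 0.0
    return sum(utt.word_confidences) / len(utt.word_confidences)


def split_pool(pool: List[Utterance],
               budget: int) -> Tuple[List[Utterance], List[Utterance]]:
    """Return (to_transcribe, machine_labeled).

    The `budget` least confident utterances are treated as the most
    informative and go to human transcribers; the remainder keep their
    ASR output and word confidences for unsupervised training.
    """
    ranked = sorted(pool, key=utterance_confidence)
    return ranked[:budget], ranked[budget:]


# Toy pool with made-up hypotheses and scores.
pool = [
    Utterance("u1", ["call", "my", "bill"], [0.9, 0.8, 0.7]),
    Utterance("u2", ["uh", "account"], [0.2, 0.4]),
    Utterance("u3", ["yes"], [0.95]),
]
to_transcribe, machine_labeled = split_pool(pool, budget=1)
```

In this toy pool, `u2` has the lowest mean confidence and is the one selected for manual transcription; `u1` and `u3` remain machine-labeled. A real system would presumably iterate: retrain on the new transcriptions, re-recognize the remaining pool, and select again.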

Bibliographic reference. Riccardi, Giuseppe / Hakkani-Tur, Dilek Z. (2003): "Active and unsupervised learning for automatic speech recognition", in EUROSPEECH-2003, 1825-1828.