Discriminative language modeling is a successful approach to improving speech recognition accuracy. However, training requires a large amount of spoken data together with manually transcribed reference text. This paper proposes an unsupervised training method to overcome this limitation. The key idea is to use an error rate estimator instead of calculating the true error rate from the reference. In standard supervised approaches, the true error rate is used only to find the oracle (the minimum error rate hypothesis) and to prioritize the competing hypotheses for weighted learning. In other words, what training actually needs is the error rate, not the reference itself. Our proposed method uses estimates of the error rate instead, so references are unnecessary. Our experiments show that the proposed method can generate a model that reaches the same level of accuracy as models trained with supervised methods.
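The two uses of the error rate named in the abstract (finding the oracle and prioritizing the competing hypotheses) can be illustrated with a minimal sketch. It assumes that each hypothesis in an N-best list already has an estimated error rate from some estimator, which is not shown here. The function name `pseudo_oracle_and_weights` and the weighting rule (the gap to the oracle's estimated error) are hypothetical choices for illustration, not the scheme used in the paper.

```python
from typing import List, Tuple


def pseudo_oracle_and_weights(estimated_errors: List[float]) -> Tuple[int, List[float]]:
    """Pick a pseudo-oracle and per-hypothesis weights from estimated error rates.

    estimated_errors[i] is the estimated error rate of the i-th hypothesis in an
    N-best list. No reference transcript is involved (illustrative assumption:
    the estimates come from an external error rate estimator).
    """
    if not estimated_errors:
        raise ValueError("N-best list must not be empty")

    # Pseudo-oracle: the hypothesis with the lowest estimated error rate.
    # Ties resolve to the earliest hypothesis in the list.
    oracle_index = min(range(len(estimated_errors)), key=estimated_errors.__getitem__)
    oracle_error = estimated_errors[oracle_index]

    # Hypothetical weighting: a competitor counts more the further its
    # estimated error rate lies above that of the pseudo-oracle.
    weights = [error - oracle_error for error in estimated_errors]
    return oracle_index, weights


if __name__ == "__main__":
    index, weights = pseudo_oracle_and_weights([0.5, 0.25, 0.75])
    print(index, weights)
```

In a supervised setup the same function would be called with true error rates computed against the reference, which is why swapping in estimates leaves the rest of the training loop unchanged.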
Bibliographic reference. Oba, Takanobu / Ogawa, Atsunori / Hori, Takaaki / Masataki, Hirokazu / Nakamura, Atsushi (2013): "Unsupervised discriminative language modeling using error rate estimator", In INTERSPEECH-2013, 1223-1227.