This paper proposes a new training framework for mixed labeled and unlabeled data and evaluates it on the task of binary phonetic classification. Our training objective function combines Maximum Mutual Information (MMI) for labeled data with Maximum Likelihood (ML) for unlabeled data. Under this modified objective, the MMI estimates are smoothed with ML estimates obtained from the unlabeled data. The same criterion can also help an existing model adapt to new speech characteristics found in unlabeled speech. In our phonetic classification experiments, the error rate decreases consistently from MLE to MMIE with I-smoothing, and then to MMIE with unlabeled-smoothing. Transductive MMIE reduces error rates further. We also experimented with a gender-mismatched condition, in which, in the best case, MMIE with unlabeled data achieves an error rate 9.3% absolute lower than MLE and 2.35% absolute lower than MMIE with I-smoothing.
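As an illustration only, the combined criterion described above can be sketched as a sum of an MMI term (log class posteriors) over labeled data and an ML term (log marginal likelihoods) over unlabeled data. The abstract does not give the exact form, so everything specific here is an assumption: the one-dimensional Gaussian class-conditional models, the weight `alpha` on the unlabeled term, and the function names are hypothetical and not taken from the paper.

```python
import math

def log_gauss(x, mean, var):
    # Log density of a 1-D Gaussian (assumed class-conditional model).
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def combined_objective(labeled, unlabeled, params, priors, alpha=1.0):
    """Hypothetical MMI + ML objective (to be maximized).

    labeled:   list of (x, class_index) pairs
    unlabeled: list of x values
    params:    list of (mean, var) per class
    priors:    list of class prior probabilities
    alpha:     assumed weight on the unlabeled ML term
    """
    def joint(x):
        # log p(x, c) for every class c
        return [math.log(priors[c]) + log_gauss(x, *params[c])
                for c in range(len(priors))]

    # MMI term: log posterior of the correct class for each labeled sample.
    mmi = sum(joint(x)[y] - logsumexp(joint(x)) for x, y in labeled)
    # ML term: log marginal likelihood of each unlabeled sample.
    ml = sum(logsumexp(joint(x)) for x in unlabeled)
    return mmi + alpha * ml

# Toy binary example with made-up values.
params = [(-1.0, 1.0), (1.0, 1.0)]
priors = [0.5, 0.5]
labeled = [(-1.0, 0), (1.0, 1)]
unlabeled = [0.5, -0.2]
value = combined_objective(labeled, unlabeled, params, priors, alpha=0.5)
```

In this sketch, setting `alpha` to zero recovers a purely discriminative objective, while larger values pull the parameters toward a generative fit of the unlabeled data, which is one way to read the smoothing effect the abstract describes.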
Bibliographic reference. Huang, Jui-Ting / Hasegawa-Johnson, Mark (2008): "Maximum mutual information estimation with unlabeled data for phonetic classification", In INTERSPEECH-2008, 952-955.