In our previous publication, we presented a new approach to HMM training, viz., training without supervision. We used an HMM trained without supervision to transcribe audio into self-organized units (SOUs) for the purpose of topic classification. In this paper, we report improvements made to the system, including the use of context-dependent acoustic models and lattice-based features, which together reduce the topic verification equal error rate from 12% to 7%. In addition to discussing the effectiveness of the SOU approach, we describe our analysis of selected SOU n-grams, which we found to be highly correlated with keywords, demonstrating the ability of the SOU technology to discover topic-relevant keywords.
Bibliographic reference. Siu, Man-Hung / Gish, Herbert / Chan, Arthur / Belfield, William (2010): "Improved topic classification and keyword discovery using an HMM-based speech recognizer trained without supervision", In INTERSPEECH-2010, 2838-2841.