11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Improved Topic Classification and Keyword Discovery Using an HMM-Based Speech Recognizer Trained Without Supervision

Man-Hung Siu, Herbert Gish, Arthur Chan, William Belfield

Raytheon BBN Technologies, USA

In our previous publication, we presented a new approach to HMM training, viz., training without supervision. We used an HMM trained without supervision to transcribe audio into self-organized units (SOUs) for the purpose of topic classification. In this paper, we report improvements made to the system, including the use of context-dependent acoustic models and lattice-based features that together reduce the topic verification equal error rate from 12% to 7%. In addition to discussing the effectiveness of the SOU approach, we describe how we analyzed selected SOU n-grams and found that they were highly correlated with keywords, demonstrating the ability of the SOU technology to discover topic-relevant keywords.

Bibliographic reference. Siu, Man-Hung / Gish, Herbert / Chan, Arthur / Belfield, William (2010): "Improved topic classification and keyword discovery using an HMM-based speech recognizer trained without supervision", in INTERSPEECH-2010, 2838-2841.