Sixth International Conference on Spoken Language Processing
In this paper, we investigate the feasibility of using a machine learning method and subword units for spoken document categorization, as an alternative to using words generated by word recognition or keyword spotting. An advantage of subword acoustic unit representations for spoken document categorization is that they do not require prior knowledge of the contents of the spoken document and can address the out-of-vocabulary (OOV) problem. The context-sensitive learning method is efficient on large, noisy corpora and well suited to subword-based categorization. Because even the best phone recognizers make a large number of mistakes, phone lattices can be used to obtain the bag of phone N-grams for each spoken document, which improves phone N-gram recall. In this study, we examine a variety of subword unit categorization terms and measure their ability to support effective categorization. We also investigate categorization performance when the underlying phonetic transcriptions contain different recognition errors.
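As an illustration of the bag of phone N-grams mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes that a phone lattice has already been expanded into a small list of alternative phone sequences; the function names, the phone labels, and the two example hypotheses are all hypothetical.

```python
from collections import Counter


def phone_ngrams(phones, n):
    """Return all contiguous phone n-grams in one phone sequence."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def bag_of_phone_ngrams(hypotheses, n):
    """Count phone n-grams over several alternative phone sequences.

    Assumption: `hypotheses` stands in for paths read off a phone lattice.
    Pooling n-grams from several paths keeps an n-gram in the bag even if
    the single best path misrecognizes one of its phones.
    """
    bag = Counter()
    for phones in hypotheses:
        bag.update(phone_ngrams(phones, n))
    return bag


# Two hypothetical recognizer outputs for the same utterance.
hypotheses = [
    ["s", "p", "iy", "ch"],
    ["s", "b", "iy", "ch"],
]
bag = bag_of_phone_ngrams(hypotheses, 2)
```

Here the bigram ("iy", "ch") appears in both hypotheses, while ("s", "p") and ("s", "b") each appear in only one, so a categorizer that uses the pooled bag still sees the correct bigram when one path contains a substitution error.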
Bibliographic reference. Qu, Weidong / Shirai, Katsuhiko (2000): "Using machine learning method and subword unit representations for spoken document categorization", In ICSLP-2000, vol.3, 1065-1068.