In this paper, we propose the class HMM, an acoustic model of multiple words that belong to the same domain-dependent class. The class HMM is trained on the set of words falling into a class, so that the model outputs a certain score when it is matched with any word in that class. This direct modeling of a class provides a simple method for improving an FSN grammar that takes semantic constraints into account, so that it accepts more word sequences without weakening those constraints. The key idea of the method is to match the input speech with a sequence of classes, not words, when the input does not match well with any of the permitted word sequences. This is realized simply by bypassing a word in the grammar network with a class model. Furthermore, because the class HMM is a rough acoustic model, the bypass is not expected to match better than the original path when the input is covered by the grammar, and so it should not reduce the constraint of the original grammar. The effectiveness of the method is evaluated by recognition experiments in an extension telephone exchange domain with 260 words, where the static branching factor of the original grammar is 12.6. The results show that 43% of the test sentences that the original grammar cannot accept are recognized as a correct sequence of classes and words and can be properly processed by the dialogue module. Furthermore, no degradation of recognition accuracy is found in recognizing 560 sentences that can be accepted by the original grammar. These results show that recognition errors due to unexpected input can be reduced by roughly half, and that the proposed scheme is effective for spontaneous speech input.
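The bypass idea described above can be illustrated with a small toy sketch. This is not the authors' implementation. The word classes, the grammar arcs, the helper names (`add_bypasses`, `arc_score`, `best_path`), and the three score constants are all hypothetical, and fixed symbolic scores stand in for real HMM likelihoods. The only property borrowed from the abstract is that a class model scores any word of its class moderately well, but worse than an exact word model.

```python
# Hypothetical word classes for a toy telephone-exchange domain.
CLASSES = {
    "VERB": {"call", "connect"},
    "NAME": {"tanaka", "suzuki"},
    "DEPT": {"sales", "research"},
}

# Assumed stand-ins for acoustic log-scores (not values from the paper):
# an exact word model beats the rough class model, which beats a mismatch.
WORD_MATCH, CLASS_MATCH, MISMATCH = 0.0, -3.0, -10.0

# Toy FSN grammar: arcs are (src, dst, label, class, is_class_arc).
# Semantic constraint: "call" is followed by a name, "connect" by a department.
GRAMMAR = [
    (0, 1, "call", "VERB", False),
    (1, 3, "tanaka", "NAME", False),
    (1, 3, "suzuki", "NAME", False),
    (0, 2, "connect", "VERB", False),
    (2, 3, "sales", "DEPT", False),
]


def add_bypasses(arcs):
    """Add one class-labelled bypass arc in parallel with every word arc."""
    bypass = {(src, dst, "<" + cls + ">", cls, True)
              for src, dst, _label, cls, _is_class in arcs}
    return arcs + sorted(bypass)


def arc_score(label, cls, is_class, token):
    """Score one spoken token against a word arc or a class arc."""
    if is_class:
        return CLASS_MATCH if token in CLASSES[cls] else MISMATCH
    return WORD_MATCH if token == label else MISMATCH


def best_path(arcs, start, final, spoken):
    """Viterbi-style search: best (score, labels) ending in the final state."""
    hyps = {start: (0.0, [])}
    for token in spoken:
        nxt = {}
        for src, dst, label, cls, is_class in arcs:
            if src not in hyps:
                continue
            total = hyps[src][0] + arc_score(label, cls, is_class, token)
            if dst not in nxt or total > nxt[dst][0]:
                nxt[dst] = (total, hyps[src][1] + [label])
        hyps = nxt
    return hyps.get(final)


robust = add_bypasses(GRAMMAR)
in_grammar = best_path(robust, 0, 3, ["call", "tanaka"])
unexpected = best_path(robust, 0, 3, ["connect", "tanaka"])
print(in_grammar)   # word path wins; the bypass scores lower
print(unexpected)   # verb is absorbed by the class bypass
```

In this toy setting, an in-grammar input keeps its original word path because every bypass scores lower, while the out-of-grammar input "connect tanaka" is matched as a class followed by a word instead of being forced onto a wrong word sequence.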
Keywords: Spontaneous speech recognition, Network grammar
Bibliographic reference. Takeda, Kazuya / Inoue, Naomi / Kuroiwa, Shingo / Konuma, Tomohiro / Yamamoto, Seiichi (1993): "Improving robustness of network grammar by using class HMM", In EUROSPEECH'93, 1623-1626.