Interspeech'2005 - Eurospeech
This paper presents an approach that integrates layered concept information into the trigram language model, both to improve understanding accuracy in spoken dialogue systems and to make language modeling materials more portable across different narrow-domain applications. With this approach, recognition accuracy improves and the out-of-grammar problem is substantially alleviated, so the concept error rate is reduced. In experiments with a real-world air-ticket information spoken dialogue system for Mandarin Chinese, relative concept error rate reductions of 20% to 30% are observed across systems trained with different amounts of language model training data. Furthermore, the layered N-gram modeling approach provides an efficient way of using existing chunk-phrase corpora to build a new application, which improves the portability of the language modeling materials. Our experiment shows that using time chunk-phrases from a similar domain achieves about 90% of the concept error rate reduction obtained with training data collected in-domain. This suggests that an initial N-gram model can be established rapidly with the help of a library of chunk-phrase corpora, before application-specific dialogue utterances are exhaustively collected and transcribed.
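The abstract does not spell out the model's formulation, so the following is only a minimal sketch of one common way to layer an N-gram model, assuming a class-based decomposition: an upper-layer trigram scores the sentence with each concept chunk-phrase replaced by a tag, and a separate lower-layer N-gram scores the words inside each chunk. The toy English data, the `<TIME>` tag, the bigram order of the lower layer, and the names `MLENgram` and `layered_prob` are all hypothetical, and the counts are unsmoothed for brevity.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list, padded with start/end markers."""
    padded = ["<s>"] * (n - 1) + list(tokens) + ["</s>"]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


class MLENgram:
    """Unsmoothed maximum-likelihood n-gram model (illustration only)."""

    def __init__(self, n):
        self.n = n
        self.ngram_counts = Counter()
        self.context_counts = Counter()

    def train(self, sentences):
        for tokens in sentences:
            for gram in ngrams(tokens, self.n):
                self.ngram_counts[gram] += 1
                self.context_counts[gram[:-1]] += 1

    def prob(self, tokens):
        """Probability of the whole token sequence, including the end marker."""
        p = 1.0
        for gram in ngrams(tokens, self.n):
            context = self.context_counts[gram[:-1]]
            if context == 0:
                return 0.0  # unseen context; a real system would smooth or back off
            p *= self.ngram_counts[gram] / context
        return p


# Hypothetical upper-layer corpus: chunk-phrases already replaced by concept tags.
upper_sentences = [
    ["i", "want", "a", "ticket", "on", "<TIME>"],
    ["book", "a", "ticket", "on", "<TIME>"],
]

# Hypothetical chunk-phrase corpus for the TIME concept. It could come from
# another domain, which is the portability idea described in the abstract.
time_chunks = [
    ["monday", "morning"],
    ["tuesday", "morning"],
    ["monday", "evening"],
]

upper = MLENgram(3)
upper.train(upper_sentences)

lower = {"<TIME>": MLENgram(2)}
lower["<TIME>"].train(time_chunks)


def layered_prob(tagged_tokens, chunks):
    """Upper-layer trigram probability times the probability of each chunk.

    tagged_tokens: sentence with concept tags in place of chunk-phrases.
    chunks: word lists for the tags, in order of appearance.
    """
    p = upper.prob(tagged_tokens)
    tags = [t for t in tagged_tokens if t in lower]
    for tag, words in zip(tags, chunks):
        p *= lower[tag].prob(words)
    return p


score = layered_prob(["book", "a", "ticket", "on", "<TIME>"], [["monday", "morning"]])
```

In this toy setup the full word sequence "book a ticket on monday morning" never occurs in the upper-layer corpus, yet it receives a nonzero score because the chunk is scored by its own model. Swapping in a different chunk-phrase corpus changes only the `lower` entry, which is the kind of reuse the abstract attributes to chunk-phrase libraries.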
Bibliographic reference. Wang, Nick J. C. (2005): "Spoken language understanding using layered n-gram modeling", In INTERSPEECH-2005, 3429-3432.