In a multi-domain spoken dialogue system, user utterances are more likely to be out-of-grammar than in a single-domain system, because the system handles more tasks. We define a topic as a domain about which the user wants more information, and we developed a method for recovering from out-of-grammar utterances based on topic estimation: the system provides a help message for the estimated domain. Multi-domain systems should also retain domain extensibility, that is, new domains should be easy to add. We therefore collected documents from the Web as training data for topic estimation. Because these data contained considerable noise, we used Latent Semantic Mapping (LSM), which enables robust topic estimation by reducing the effect of noise in the data. Experiments on 272 utterances collected with a Wizard of Oz (WOZ)-like method showed that our method improved topic estimation accuracy by 23.1 points over the baseline.
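The idea of LSM-based topic estimation can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the domain names, the raw word counts (no term weighting), the rank `k`, the nearest-document scoring rule, and the function names `build_lsm` and `estimate_topic` are all assumptions made for illustration. The sketch builds a word-document matrix, truncates its singular value decomposition to `k` dimensions, projects an utterance into that reduced space, and returns the domain of the most similar document by cosine similarity.

```python
import numpy as np

# Hypothetical toy corpus standing in for Web documents: (domain, text).
DOCS = [
    ("restaurant", "restaurant menu sushi dinner eat"),
    ("restaurant", "restaurant menu eat lunch reservation"),
    ("hotel", "hotel room stay night booking breakfast"),
    ("hotel", "hotel room stay night checkout"),
    ("bus", "bus route stop timetable"),
    ("bus", "bus route stop fare ticket"),
]


def build_lsm(docs, k):
    """Build a rank-k latent space from a word-document count matrix."""
    vocab = sorted({w for _, text in docs for w in text.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, (_, text) in enumerate(docs):
        for w in text.split():
            A[index[w], j] += 1.0
    # Truncated SVD: keeping only the k largest singular values is the
    # step that discards low-variance directions, where noise tends to sit.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    # Documents in the latent space: U_k^T A = S_k V_k^T, one row per document.
    doc_vecs = (np.diag(s[:k]) @ Vt[:k, :]).T
    return index, U_k, doc_vecs


def estimate_topic(utterance, docs, model):
    """Return the domain of the nearest document, or None if no word is known."""
    index, U_k, doc_vecs = model
    q = np.zeros(len(index))
    for w in utterance.lower().split():
        if w in index:  # out-of-vocabulary words are ignored
            q[index[w]] += 1.0
    q_hat = U_k.T @ q  # project the utterance into the latent space
    q_norm = np.linalg.norm(q_hat)
    if q_norm < 1e-12:
        return None
    best_domain, best_score = None, -1.0
    for (domain, _), d in zip(docs, doc_vecs):
        d_norm = np.linalg.norm(d)
        if d_norm < 1e-12:
            continue
        score = float(q_hat @ d) / (q_norm * d_norm)  # cosine similarity
        if score > best_score:
            best_domain, best_score = domain, score
    return best_domain


model = build_lsm(DOCS, k=3)
print(estimate_topic("I want to eat sushi", DOCS, model))
```

Under these assumptions, adding a new domain only requires appending its documents to the corpus and rebuilding the latent space, which is the kind of extensibility the abstract asks for. The estimated domain would then select which help message to present.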
Bibliographic reference. Ikeda, Satoshi / Komatani, Kazunori / Ogata, Tetsuya / Okuno, Hiroshi G. (2007): "Topic estimation with domain extensibility for guiding user's out-of-grammar utterances in multi-domain spoken dialogue systems", In INTERSPEECH-2007, 2561-2564.