A key issue in a spoken dialogue system is the successful semantic interpretation of the output from the speech recognizer. Extracting the semantic concepts of an utterance, i.e. its meaningful phrases, is traditionally performed using rule-based methods. In this paper we describe a statistical framework for modeling and decoding semantic concepts based on discrete hidden Markov models (DHMMs). Each semantic concept class is modeled as a multi-state DHMM, where the observations are the recognized words. The proposed decoding procedure is capable of parsing an utterance into a sequence of phrases, each belonging to a different concept class. The phrase sequence corresponds to a concept segmentation and class identification, while the semantic entities constituting each phrase carry the semantic value.
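The decoding idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two concept classes (`FROM`, `TO`), the two-state left-to-right topology, the vocabulary, the probability floor, and every probability value are assumptions invented for illustration. The concept DHMMs are joined into one composite model, a standard Viterbi search finds the best state path over the recognized words, and consecutive words decoded in the same concept class are grouped into a phrase.

```python
import math

# Hypothetical toy model: two concept classes, each a 2-state
# left-to-right discrete HMM. All probabilities are invented.
STATES = [("FROM", 0), ("FROM", 1), ("TO", 0), ("TO", 1)]

# Discrete emission probabilities: state -> {word: probability}.
EMIT = {
    ("FROM", 0): {"from": 0.9},
    ("FROM", 1): {"oslo": 0.5, "bergen": 0.5},
    ("TO", 0): {"to": 0.9},
    ("TO", 1): {"oslo": 0.5, "bergen": 0.5},
}

# Transitions inside a concept, plus exits from a concept's last
# state into the first state of the other concept.
TRANS = {
    ("FROM", 0): {("FROM", 1): 1.0},
    ("FROM", 1): {("FROM", 1): 0.2, ("TO", 0): 0.8},
    ("TO", 0): {("TO", 1): 1.0},
    ("TO", 1): {("TO", 1): 0.2, ("FROM", 0): 0.8},
}

# An utterance may start in the first state of either concept.
START = {("FROM", 0): 0.5, ("TO", 0): 0.5}

FLOOR = 1e-6  # assumed floor for unseen words and transitions


def logp(table, key):
    """Log probability with a floor for missing entries."""
    return math.log(table.get(key, FLOOR))


def viterbi(words):
    """Return the most likely composite state path for the words."""
    delta = {s: logp(START, s) + logp(EMIT[s], words[0]) for s in STATES}
    backpointers = []
    for word in words[1:]:
        new_delta, bp = {}, {}
        for s in STATES:
            # Best predecessor of state s at this time step.
            prev = max(STATES, key=lambda p: delta[p] + logp(TRANS[p], s))
            new_delta[s] = (
                delta[prev] + logp(TRANS[prev], s) + logp(EMIT[s], word)
            )
            bp[s] = prev
        delta = new_delta
        backpointers.append(bp)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: delta[s])
    path = [state]
    for bp in reversed(backpointers):
        state = bp[state]
        path.append(state)
    path.reverse()
    return path


def segment(words):
    """Group words into (concept, phrase) pairs along the Viterbi path."""
    phrases = []
    for word, (concept, _) in zip(words, viterbi(words)):
        if phrases and phrases[-1][0] == concept:
            phrases[-1][1].append(word)
        else:
            phrases.append((concept, [word]))
    return phrases
```

With these toy parameters, `segment(["from", "oslo", "to", "bergen"])` yields a `FROM` phrase containing "from oslo" followed by a `TO` phrase containing "to bergen". Grouping by concept label relies on the statement that adjacent phrases belong to different concept classes.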
The algorithm has been tested on a dialogue system for bus route information in Norwegian. The results confirm the applicability of the procedure. Semantically relevant concepts in input inquiries could be identified with a 6.9% error rate at the sentence level. The corresponding segmentation error rate was 8.6% when concept segmentation information was available during training. Without this information, i.e. when training was performed in an embedded mode, the segmentation error rate increased to 23.5%.
Cite as: Johnsen, M.H., Holter, T., Svendsen, T., Harborg, E. (2000) Stochastic modeling of semantic content for use in a spoken dialogue system. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 218-221, doi: 10.21437/ICSLP.2000-54
@inproceedings{johnsen00_icslp,
  author={Magne H. Johnsen and Trym Holter and Torbjørn Svendsen and Erik Harborg},
  title={{Stochastic modeling of semantic content for use in a spoken dialogue system}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={1},
  pages={218--221},
  doi={10.21437/ICSLP.2000-54}
}