Sixth International Conference on Spoken Language Processing (ICSLP 2000)
October 16-20, 2000
Building Stochastic Language Model Networks Based on Simultaneous Word/Phrase Clustering
Kallirroi Georgila, Nikos Fakotakis, George Kokkinakis
Wire Communications Laboratory
University of Patras, Greece
In this paper we present a novel method for creating stochastic
networks for language modelling in spoken dialogue systems.
This is accomplished by taking a set of sentences (created
manually, or derived from simulation experiments, from use of the
system itself, from the application grammar, or from a combination of
these sources), and training a Hidden Markov Model (HMM),
which incorporates all the information about the structure of
these sentences. Our technique has the great advantage that
during the creation of the HMM, classes containing words or
phrases with semantic-syntactic similarities are formed
automatically. After all the training data has been used, the final
HMM is transformed to a stochastic network. The states and
observations of the HMM correspond to the word/phrase classes
and words/phrases, respectively. The nodes of the stochastic
network are the word/phrase classes and the arcs are the state
transition probabilities of the HMM. The observation
probabilities of the HMM correspond to the probabilities within
the classes of the stochastic network. Our method has been
tested using data from an Interactive telephone-based Directory
Assistance Services system and a call-routing spoken dialogue
system and has shown the expected advantages.
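
The correspondence between the HMM and the stochastic network described above can be illustrated with a minimal sketch. It is not the authors' training procedure: the class names, phrases, probabilities, the "<s>" and "</s>" boundary markers, and the helper functions are all hypothetical, and the sketch assumes for simplicity that each phrase belongs to exactly one class. It only shows how a path through the network is scored from arc (state transition) probabilities and within-class (observation) probabilities.

```python
import math

# Hypothetical toy network; every name and number here is invented.
# Nodes: word/phrase classes (HMM states), each holding the
# within-class probabilities (HMM observation probabilities).
classes = {
    "GREETING": {"hello": 0.6, "good morning": 0.4},
    "REQUEST": {"i want": 0.5, "i would like": 0.5},
    "SERVICE": {"the phone number": 0.7, "the address": 0.3},
}

# Arcs: class-to-class probabilities (HMM state transitions).
# "<s>" and "</s>" are assumed sentence start and end markers.
arcs = {
    "<s>": {"GREETING": 0.5, "REQUEST": 0.5},
    "GREETING": {"REQUEST": 1.0},
    "REQUEST": {"SERVICE": 1.0},
    "SERVICE": {"</s>": 1.0},
}


def class_of(phrase):
    """Return the single class containing the phrase (simplifying assumption)."""
    for name, members in classes.items():
        if phrase in members:
            return name
    raise KeyError(phrase)


def path_log_prob(phrases):
    """Log-probability of a phrase sequence along its path through the network."""
    log_p = 0.0
    prev = "<s>"
    for phrase in phrases:
        cur = class_of(phrase)
        # Arc probability times within-class probability.
        p = arcs.get(prev, {}).get(cur, 0.0) * classes[cur][phrase]
        if p == 0.0:
            return float("-inf")  # the network has no such arc
        log_p += math.log(p)
        prev = cur
    p_end = arcs.get(prev, {}).get("</s>", 0.0)
    return log_p + math.log(p_end) if p_end > 0.0 else float("-inf")


print(math.exp(path_log_prob(["hello", "i want", "the address"])))
```

In this toy network the sequence "hello", "i want", "the address" follows the path GREETING, REQUEST, SERVICE, so its probability is the product of the three arc probabilities, the three within-class probabilities, and the final arc to "</s>".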
Georgila, Kallirroi / Fakotakis, Nikos / Kokkinakis, George (2000):
"Building stochastic language model networks based on simultaneous word/phrase clustering",
In ICSLP-2000, vol.1, 122-125.