Sixth International Conference on Spoken Language Processing (ICSLP 2000)

Beijing, China
October 16-20, 2000

Building Stochastic Language Model Networks Based on Simultaneous Word/Phrase Clustering

Kallirroi Georgila, Nikos Fanotakis, George Kokkinakis

Wire Communications Laboratory, University of Patras, Greece

In this paper we present a novel method for creating stochastic networks for language modelling in spoken dialogue systems. The method takes a set of sentences (created manually, derived from simulation experiments, from use of the system itself, or from the application grammar, or obtained by a combination of these methods) and trains a Hidden Markov Model (HMM) that incorporates all the information about the structure of these sentences. A key advantage of the technique is that classes containing words or phrases with semantic-syntactic similarities are formed automatically while the HMM is being created. After all the training data has been used, the final HMM is transformed into a stochastic network. The states of the HMM correspond to the word/phrase classes, and its observations correspond to the words/phrases. The nodes of the stochastic network are the word/phrase classes, and the arcs carry the state transition probabilities of the HMM. The observation probabilities of the HMM correspond to the probabilities within the classes of the stochastic network. The method has been tested using data from an interactive telephone-based Directory Assistance Services system and from a call-routing spoken dialogue system, and has shown the expected advantages.
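
The correspondence between the trained HMM and the stochastic network can be illustrated with a minimal sketch. It covers only the final transformation step described above, not the training or clustering procedure, which the abstract does not specify. The names StochasticNetwork, hmm_to_network and path_probability are hypothetical, and the class names and probabilities are invented toy values, not data from the paper.

```python
from dataclasses import dataclass


@dataclass
class StochasticNetwork:
    nodes: dict  # class name -> {word or phrase: probability within the class}
    arcs: dict   # (from class, to class) -> transition probability


def hmm_to_network(states, transitions, emissions):
    """Map HMM parameters onto a stochastic network (hypothetical helper).

    states      -- list of class names, one per HMM state
    transitions -- transitions[i][j] = P(state j | state i)
    emissions   -- emissions[i] = {word/phrase: P(word/phrase | state i)}
    """
    # Each HMM state becomes a node holding its observation probabilities.
    nodes = {name: dict(emissions[i]) for i, name in enumerate(states)}
    # Each nonzero state transition becomes a weighted arc.
    arcs = {
        (states[i], states[j]): p
        for i, row in enumerate(transitions)
        for j, p in enumerate(row)
        if p > 0.0
    }
    return StochasticNetwork(nodes, arcs)


def path_probability(network, path):
    """Probability of a sequence of (class, word/phrase) pairs,
    conditioned on starting in the first class of the path."""
    prob = 1.0
    previous = None
    for cls, item in path:
        if previous is not None:
            prob *= network.arcs.get((previous, cls), 0.0)
        prob *= network.nodes[cls].get(item, 0.0)
        previous = cls
    return prob


# Toy parameters (assumed for illustration only).
states = ["REQUEST", "TITLE", "NAME"]
transitions = [
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
emissions = [
    {"i would like": 0.5, "give me": 0.5},
    {"mister": 0.5, "doctor": 0.5},
    {"smith": 0.25, "jones": 0.75},
]
network = hmm_to_network(states, transitions, emissions)
```

In this sketch, a multi-word phrase such as "i would like" is a single observation of its class, which mirrors the word/phrase classes described in the abstract. Calling path_probability(network, [("REQUEST", "give me"), ("NAME", "jones")]) multiplies the within-class probabilities by the arc probability between the two classes.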



Bibliographic reference.  Georgila, Kallirroi / Fanotakis, Nikos / Kokkinakis, George (2000): "Building stochastic language model networks based on simultaneous word/phrase clustering", In ICSLP-2000, vol.1, 122-125.