ISCA Archive ICSLP 2000

Building stochastic language model networks based on simultaneous word/phrase clustering

Kallirroi Georgila, Nikos Fanotakis, George Kokkinakis

In this paper we present a novel method for creating stochastic networks for language modelling in spoken dialogue systems. The method takes a set of sentences (created manually, or derived from simulation experiments, from use of the system itself, from the application grammar, or from a combination of these sources) and trains a Hidden Markov Model (HMM) that incorporates all the information about the structure of these sentences. A key advantage of the technique is that classes containing words or phrases with semantic-syntactic similarities are formed automatically while the HMM is being created. After all the training data has been used, the final HMM is transformed into a stochastic network. The states of the HMM correspond to the word/phrase classes, and its observations correspond to the words/phrases themselves. The nodes of the stochastic network are the word/phrase classes, and its arcs carry the state transition probabilities of the HMM. The observation probabilities of the HMM correspond to the probabilities within the classes of the stochastic network. Our method has been tested using data from an Interactive telephone-based Directory Assistance Services system and a call-routing spoken dialogue system and has shown the expected advantages.
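
The correspondence between the trained HMM and the stochastic network can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not cover training or the clustering step: it only shows how states map to nodes, transition probabilities to arcs, and observation probabilities to within-class probabilities. The names `StochasticNetwork`, `hmm_to_network` and `sentence_probability`, as well as the toy classes and probabilities, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class StochasticNetwork:
    # Node = word/phrase class; value = {word or phrase: probability within the class}
    nodes: dict
    # Arc = (source class, target class); value = HMM state transition probability
    arcs: dict


def hmm_to_network(transitions, emissions):
    """Map an already trained HMM onto a stochastic network.

    transitions: {state: {next_state: probability}}
    emissions:   {state: {observation: probability}}
    Both structures are illustrative assumptions, not the paper's data format.
    """
    # Each HMM state becomes a node holding its observation probabilities.
    nodes = {state: dict(obs) for state, obs in emissions.items()}
    # Each non-zero state transition becomes a weighted arc.
    arcs = {
        (src, dst): p
        for src, row in transitions.items()
        for dst, p in row.items()
        if p > 0.0
    }
    return StochasticNetwork(nodes, arcs)


def sentence_probability(net, path):
    """Score a path given as a list of (class, word/phrase) pairs.

    Multiplies arc probabilities between consecutive classes by the
    within-class probability of each word/phrase. The initial-state
    probability is omitted to keep the sketch short.
    """
    prob = 1.0
    prev = None
    for cls, item in path:
        if prev is not None:
            prob *= net.arcs.get((prev, cls), 0.0)
        prob *= net.nodes.get(cls, {}).get(item, 0.0)
        prev = cls
    return prob


# Toy HMM with made-up classes and probabilities (hypothetical data).
transitions = {
    "GREETING": {"REQUEST": 1.0},
    "REQUEST": {"OBJECT": 0.75, "REQUEST": 0.25},
    "OBJECT": {},
}
emissions = {
    "GREETING": {"hello": 0.5, "good morning": 0.5},
    "REQUEST": {"i want": 0.5, "i would like": 0.5},
    "OBJECT": {"the number of": 1.0},
}

network = hmm_to_network(transitions, emissions)
score = sentence_probability(
    network,
    [("GREETING", "hello"), ("REQUEST", "i want"), ("OBJECT", "the number of")],
)
```

In this toy example, a path through the network is scored by multiplying the arc probabilities with the within-class probabilities of the chosen words or phrases; a path that uses an arc or phrase absent from the network receives probability zero.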


Cite as: Georgila, K., Fanotakis, N., Kokkinakis, G. (2000) Building stochastic language model networks based on simultaneous word/phrase clustering. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 122-125

@inproceedings{georgila00_icslp,
  author={Kallirroi Georgila and Nikos Fanotakis and George Kokkinakis},
  title={{Building stochastic language model networks based on simultaneous word/phrase clustering}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={1},
  pages={122--125}
}