10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Incremental Dialog Clustering for Speech-to-Speech Translation

David Stallard, Stavros Tsakalidis, Shirin Saleem

BBN Technologies, USA

Application domains for speech-to-speech translation and dialog systems often contain sub-domains and/or task types for which different outputs are appropriate for a given input. It would be useful to find such sub-domain structure in training corpora automatically, and to classify new interactions with the system into one of these sub-domains. To this end, we present a document-clustering approach to sub-domain classification that uses a recently developed algorithm based on von Mises-Fisher distributions. We give preliminary perplexity-reduction and MT performance results for a speech-to-speech translation system that uses this model.
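
As a rough illustration of the idea, and not the paper's own algorithm (which this abstract does not spell out), the sketch below uses spherical k-means, which is the hard-assignment limit of a mixture of von Mises-Fisher distributions with a shared concentration. Dialogs are treated as unit-length bag-of-words vectors, clustered by cosine similarity, and a new dialog is re-classified after each utterance. The vocabulary, toy dialogs, seeds, and all function names are hypothetical and chosen only for the example.

```python
import math
from collections import Counter

def unit_vector(tokens, vocab):
    """Bag-of-words counts over `vocab`, scaled to unit L2 length."""
    counts = Counter(t for t in tokens if t in vocab)
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def nearest(vec, means):
    """Index of the mean direction with the highest cosine similarity."""
    return max(range(len(means)), key=lambda k: dot(vec, means[k]))

def spherical_kmeans(docs, seeds, iters=10):
    """Hard-assignment clustering of unit vectors (assumed stand-in for vMF)."""
    means = [list(s) for s in seeds]
    labels = [0] * len(docs)
    for _ in range(iters):
        labels = [nearest(d, means) for d in docs]
        for k in range(len(means)):
            members = [d for d, lab in zip(docs, labels) if lab == k]
            if not members:
                continue  # keep the old mean for an empty cluster
            total = [sum(col) for col in zip(*members)]
            norm = math.sqrt(sum(x * x for x in total))
            if norm > 0:
                means[k] = [x / norm for x in total]  # re-project onto the sphere
    return means, labels

def classify_incrementally(utterances, vocab, means):
    """Yield the best cluster after each new utterance of a dialog."""
    seen = []
    for utt in utterances:
        seen.extend(utt.split())
        yield nearest(unit_vector(seen, vocab), means)

# Hypothetical toy data: two checkpoint dialogs and two medical dialogs.
vocab = ["checkpoint", "vehicle", "search", "pain", "doctor", "medicine"]
training = [
    "stop the vehicle at the checkpoint",
    "we need to search the vehicle",
    "where is the pain",
    "the doctor will treat the pain with medicine",
]
docs = [unit_vector(d.split(), vocab) for d in training]
means, labels = spherical_kmeans(docs, seeds=[docs[0], docs[2]])

new_dialog = ["the vehicle hit me", "I have pain and need a doctor and medicine"]
trace = list(classify_incrementally(new_dialog, vocab, means))
print(labels, trace)
```

In this toy run the new dialog is first assigned to the checkpoint-like cluster and then moves to the medical cluster once the second utterance arrives, which is the kind of incremental re-classification the title refers to. A full vMF mixture would additionally estimate a concentration parameter and use soft assignments.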

Bibliographic reference. Stallard, David / Tsakalidis, Stavros / Saleem, Shirin (2009): "Incremental dialog clustering for speech-to-speech translation", In INTERSPEECH-2009, 428-431.