Applied Spoken Language Interaction in Distributed Environments (ASIDE 2005)

ITRW and COST278 Final Workshop
Aalborg, Denmark
November 10-11, 2005

Domain Adaptation of a Distributed Speech-To-Speech Translation System

Michael Stier (1), Stefan Feldes (2)

(1) Institute for Digital Signal Processing, Mannheim University of Applied Sciences, Germany
(2) T-Systems, Technology Center ENPS, Darmstadt, Germany

This paper reports on an experiment to design, implement, and optimise a speech-to-speech translation system built solely from an appropriate combination of currently available commercial components for speech recognition, machine translation, and speech synthesis. We investigated the basic feasibility of such a system and the performance gains achievable through domain adaptation. A distributed architecture was chosen to implement an experimental system that supports full-duplex communication. In parallel, we analysed which application domains are useful and suitable for this system infrastructure. For optimisation, we then investigated how speech recognition accuracy can be improved by adapting the recogniser to the chosen limited domain (e.g. hotel reservation). This was done through speaker adaptation of the acoustic model and, more importantly, domain-specific adaptation of the language model (LM). Two approaches to LM adaptation were compared: statistical n-grams and context-free grammars (CFG). Evaluation by conversation tests shows significant improvements with both approaches. For example, word accuracy rose from 75% to 92% using optimised n-grams and to 91% using CFG. The pros and cons with respect to overall system performance and applicability are discussed in detail.
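The word accuracy figures quoted in the abstract can be related to error rates with a small calculation. The following is a minimal Python sketch, assuming the conventional definition of word accuracy as one minus the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The abstract does not state which scoring tool or exact definition the authors used, and the function names and the sample utterance are hypothetical.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy under the assumed definition: 1 - edit_distance / N,
    where N is the number of words in the reference transcript."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 1.0 - prev[len(hyp)] / len(ref)


def relative_error_reduction(acc_before: float, acc_after: float) -> float:
    """Relative reduction in word error rate, taking error = 1 - accuracy."""
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    return (err_before - err_after) / err_before


if __name__ == "__main__":
    # Hypothetical hotel-reservation utterance with one deleted word.
    print(word_accuracy("i would like a double room",
                        "i would like double room"))
    # Figures from the abstract: baseline 75%, optimised n-grams 92%, CFG 91%.
    print(relative_error_reduction(0.75, 0.92))
    print(relative_error_reduction(0.75, 0.91))
```

Under this assumed definition, raising word accuracy from 75% to 92% lowers the word error rate from 25% to 8%, a relative reduction of about 68%. The CFG result of 91% corresponds to a relative reduction of about 64%.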


Bibliographic reference. Stier, Michael / Feldes, Stefan (2005): "Domain adaptation of a distributed speech-to-speech translation system", in ASIDE-2005, paper 10.