5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Effective Structural Adaptation of LVCSR Systems to Unseen Domains Using Hierarchical Connectionist Acoustic Models

Jürgen Fritsch (1), Michael Finke (2), Alex Waibel (3)

(1) Interactive Systems Labs, University of Karlsruhe, Germany
(2) Interactive Systems Labs, Carnegie Mellon University, USA
(3) Interactive Systems Inc, USA

We present an approach to efficiently and effectively downsize and adapt the structure of large vocabulary conversational speech recognition (LVCSR) systems to unseen domains, requiring only small amounts of transcribed adaptation data. Our approach aims to bring today's mostly task-dependent systems closer to the goal of domain independence. To achieve this, we rely on the ACID/HNN framework, a hierarchical connectionist modeling paradigm that allows a tree-structured modeling hierarchy to be adapted dynamically to the differing specificity of phonetic context in new domains. We validated the approach experimentally by adapting the size and structure of ACID/HNN-based acoustic models trained on Switchboard to two quite different, unseen domains: Wall Street Journal and an English Spontaneous Scheduling Task. In both cases, our approach yields considerably downsized acoustic models with performance improvements of up to 18% over the unadapted baseline models.

Bibliographic reference. Fritsch, Jürgen / Finke, Michael / Waibel, Alex (1998): "Effective structural adaptation of LVCSR systems to unseen domains using hierarchical connectionist acoustic models", in ICSLP-1998, paper 0754.