11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Language Model Cross Adaptation for LVCSR System Combination

Xunying Liu, Mark J. F. Gales, Phil C. Woodland

University of Cambridge, UK

State-of-the-art large vocabulary continuous speech recognition (LVCSR) systems often combine outputs from multiple sub-systems developed at different sites. Cross system adaptation can be used as an alternative to direct hypothesis level combination schemes such as ROVER. In normal cross adaptation, it is assumed that useful diversity among systems exists only at the acoustic level. However, complementary features among complex LVCSR systems also manifest themselves in other layers of the modelling hierarchy, e.g., at the subword and word levels. It is thus interesting to also cross adapt language models (LMs) to capture them. In this paper, cross adaptation of multi-level LMs modelling both syllable and word sequences was investigated to improve LVCSR system combination. Significant error rate reductions of 6.7% relative were obtained over ROVER and acoustic model only cross adaptation when combining 13 Chinese LVCSR sub-systems used in the 2010 DARPA GALE evaluation.
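
The abstract reports its result as a relative figure, so a minimal sketch of how a relative error rate reduction is conventionally computed may help. The function name and the input values below are illustrative assumptions and are not taken from the paper, which does not give absolute error rates in this abstract.

```python
def relative_reduction(baseline_error: float, new_error: float) -> float:
    """Return the relative error rate reduction, in percent.

    Both arguments are error rates on the same scale (e.g. percent).
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical numbers chosen only for illustration, NOT results from the paper:
# a baseline error rate of 10.0% lowered to 9.0% by a combined system.
example = relative_reduction(10.0, 9.0)
print(f"{example:.1f}% relative")
```

Under this convention, a relative figure describes the fraction of the baseline's errors that were removed, so the same relative reduction corresponds to a smaller absolute change when the baseline error rate is already low.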

Bibliographic reference.  Liu, Xunying / Gales, Mark J. F. / Woodland, Phil C. (2010): "Language model cross adaptation for LVCSR system combination", In INTERSPEECH-2010, 342-345.