16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

The Cambridge University 2014 BOLT Conversational Telephone Mandarin Chinese LVCSR System for Speech Translation

Xunying Liu, Federico Flego, Linlin Wang, C. Zhang, Mark J. F. Gales, Philip C. Woodland

University of Cambridge, UK

This paper presents the development of the 2014 Cambridge University conversational telephone Mandarin Chinese LVCSR system for the DARPA BOLT speech translation evaluation. A range of advanced modelling techniques was employed both to improve recognition performance and to provide suitable integration with the translation system. These include an improved system combination technique using frame-level acoustic model combination via joint decoding. Sequence-trained deep neural network (DNN) based hybrid and tandem systems were combined on the fly to produce a consistent decoding output during search. A multi-level paraphrastic recurrent neural network language model (RNNLM) was also used; it models both alternative paraphrase expressions and character sequences while preserving a consistent character-to-word segmentation. The system gave an overall character error rate (CER) of 29.1% on the BOLT dev14 development set.
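The reported metric, character error rate, is conventionally the character-level edit distance between hypothesis and reference divided by the number of reference characters. The following is a minimal sketch of that conventional definition, not the evaluation's actual scoring tool; the function names are hypothetical, and stripping whitespace before scoring is an assumption made so that word segmentation does not affect the result.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from reference prefix to empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference character
                            curr[j - 1] + 1,      # insertion of a hypothesis character
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = character edit distance / number of reference characters.

    Assumption: whitespace is removed first, so differences in word
    segmentation do not count as errors.
    """
    ref_chars = [c for c in reference if not c.isspace()]
    hyp_chars = [c for c in hypothesis if not c.isspace()]
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    # One substitution and one deletion against a six-character reference.
    print(character_error_rate("今天 天气 很好", "今天 天汽 好"))
```

Because insertions count as errors, a CER computed this way can exceed 100% when the hypothesis is much longer than the reference.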

Bibliographic reference. Liu, Xunying / Flego, Federico / Wang, Linlin / Zhang, C. / Gales, Mark J. F. / Woodland, Philip C. (2015): "The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation", in INTERSPEECH-2015, 3145-3149.