This paper presents the development of the 2014 Cambridge University conversational telephone Mandarin Chinese LVCSR system for the DARPA BOLT speech translation evaluation. A range of advanced modelling techniques was employed both to improve recognition performance and to provide suitable integration with the translation system. These include an improved system combination technique using frame-level acoustic model combination via joint decoding. Sequence-trained deep neural network (DNN) based hybrid and tandem systems were combined on-the-fly to produce a consistent decoding output during search. A multi-level paraphrastic recurrent neural network language model (RNNLM), which models both alternative paraphrase expressions and character sequences while preserving a consistent character-to-word segmentation, was also used. The system gave an overall character error rate (CER) of 29.1% on the BOLT dev14 development set.
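For readers unfamiliar with the metric, CER is conventionally the character-level edit distance (substitutions, deletions and insertions) between hypothesis and reference, divided by the number of reference characters. The following is a minimal sketch of that standard definition, not the scoring pipeline used in the evaluation. The function names `edit_distance` and `character_error_rate` are hypothetical, and stripping whitespace before scoring is an assumption made so that word segmentation does not affect the result.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Corpus-level CER: total character edits / total reference characters.

    Assumption: whitespace is removed first, so the score is independent
    of how the text is segmented into words.
    """
    total_errors = 0
    total_chars = 0
    for ref, hyp in zip(references, hypotheses):
        ref_chars = [c for c in ref if not c.isspace()]
        hyp_chars = [c for c in hyp if not c.isspace()]
        total_errors += edit_distance(ref_chars, hyp_chars)
        total_chars += len(ref_chars)
    if total_chars == 0:
        raise ValueError("reference contains no characters")
    return total_errors / total_chars


if __name__ == "__main__":
    # Toy example (invented data): one substitution and one deletion
    # against a six-character reference.
    refs = ["今天 天气 很 好"]
    hyps = ["今天 天汽 好"]
    print(f"CER = {character_error_rate(refs, hyps):.3f}")
```

Under this definition, the reported 29.1% corresponds to roughly 29 character edits per 100 reference characters, accumulated over the whole development set rather than averaged per utterance.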
Bibliographic reference. Liu, Xunying / Flego, Federico / Wang, Linlin / Zhang, C. / Gales, Mark J. F. / Woodland, Philip C. (2015): "The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation", In INTERSPEECH-2015, 3145-3149.