In a growing number of applications, such as simultaneous interpretation, audio or text conveying the same information may be available in several languages. These parallel views contain redundant information that can be exploited to improve the performance of speech and language processing applications. We propose a method that directly integrates ASR word graphs or lattices with phrase tables from an SMT system in order to combine such parallel speech data and improve ASR performance. We apply this technique to speeches from four European Parliament committees and obtain a 16.6% relative improvement in WER (20.8% after a second iteration) when the Portuguese and Spanish interpreted versions are combined with the original English speeches. Our results indicate that further improvements may be possible by including additional languages.
Index Terms: multistream combination, speech recognition, machine translation
Bibliographic reference. Miranda, João / Neto, João Paulo da Silva / Black, Alan W. (2012): "Parallel combination of multilingual speech streams for improved ASR", In INTERSPEECH-2012, 1027-1030.