This paper addresses automatic speech recognition (ASR) for multilingual audio content, such as international conference recordings and broadcast news. Simultaneous ASR across languages is a promising way to handle such content efficiently. Conventionally, ASR has been performed independently, language by language, even when multilingual speech is available, that is, utterances in several languages that convey the same meaning. We discuss a bilingual speech recognition framework based on statistical ASR and machine translation (MT), in which ASR for the two languages is performed simultaneously and complementarily. We then show that the framework works well through Japanese speech recognition that uses the corresponding English text and MT.
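As a rough illustration of how text from one language could complement ASR in the other, the sketch below rescores a Japanese N-best list using words from an MT output of the corresponding English text. This is not the method of the paper, whose details the abstract does not give. The function name `rescore_nbest`, the additive overlap bonus, the `weight` parameter, and the romanized toy data are all assumptions made for the example.

```python
def rescore_nbest(nbest, mt_words, weight=1.0):
    """Return the hypothesis with the best combined score.

    nbest    : list of (words, asr_log_score) pairs (hypothetical ASR output)
    mt_words : words produced by MT from the other language's text
    weight   : assumed bonus per hypothesis word that also appears in the MT output
    """
    mt_vocab = set(mt_words)

    def combined(item):
        words, asr_score = item
        # Count hypothesis words that are supported by the MT output.
        overlap = sum(1 for w in words if w in mt_vocab)
        return asr_score + weight * overlap

    return max(nbest, key=combined)[0]


# Toy data: two competing Japanese hypotheses (romanized) with ASR log scores.
nbest = [
    (["kyou", "wa", "ame"], -10.0),
    (["kyou", "wa", "hare"], -10.5),
]
# Assumed MT output for the corresponding English sentence.
mt_words = ["kyou", "wa", "hare", "desu"]

best_without_mt = rescore_nbest(nbest, mt_words, weight=0.0)
best_with_mt = rescore_nbest(nbest, mt_words, weight=1.0)
```

With `weight=0.0` the ASR score alone decides and the first hypothesis is kept. With the overlap bonus, the second hypothesis is preferred because it shares one more word with the MT output. A real system would more plausibly combine probabilistic acoustic, language, and translation model scores than use a raw word count.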
Bibliographic reference. Nanjo, Hiroaki / Oku, Yuichi / Yoshimi, Takehiko (2007): "Automatic speech recognition framework for multilingual audio contents", In INTERSPEECH-2007, 1445-1448.