2nd Workshop on Spoken Language Technologies for Under-Resourced Languages

Universiti Sains Malaysia, Penang, Malaysia
May 3-5, 2010

Autonomous Acoustic Model Adaptation for Multilingual Meeting Transcription Involving High- and Low-Resourced Languages

Sethserey Sam (1,2), Laurent Besacier (1), Eric Castelli (2), Bin Ma (3), Cheung-Chi Leung (3), Haizhou Li (3)

(1) LIG Laboratory, UMR CNRS 5524 BP 53, 38041 Grenoble Cedex 9, France
(2) MICA research center, UMI CNRS 2954, HUT, Hanoi, Vietnam
(3) Institute for Infocomm Research, Singapore

Automatic speech transcription of multilingual conferences or meetings faces several challenges. Firstly, the dialogue occurs between native and non-native speakers. Secondly, the non-native speakers come from different parts of the world (e.g., English spoken by native French speakers or by native Vietnamese speakers). Thirdly, little or no data is available to bootstrap the acoustic modeling. This paper presents autonomous online and offline acoustic model adaptation approaches, which require no additional data in the adaptation process, to address the above challenges and to improve the performance of the phone recognizers used for automatic transcription. Experiments show that our adaptation approach (online interpolation with MLLR based on PR-VSM) provides an absolute gain of about 4% in Phone Accuracy Rate (PAR) over the multilingual baseline system and even outperforms the supervised monolingual systems.
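The MLLR adaptation named above re-estimates HMM Gaussian means with a shared affine transform, mu' = A mu + b. Below is a minimal illustrative sketch in Python/NumPy, not the authors' implementation: in a real recognizer the transform is estimated from frame posteriors over adaptation data, whereas here it is fit by least squares from paired mean vectors.

```python
# Illustrative sketch of MLLR mean adaptation (hypothetical helper names;
# not the system described in the paper). MLLR adapts Gaussian means with
# a shared affine transform: mu' = A @ mu + b.
import numpy as np

def estimate_mllr_transform(source_means, target_means):
    """Least-squares estimate of W = [A | b] mapping source means to targets.

    source_means, target_means: arrays of shape (n_gaussians, dim).
    Returns W of shape (dim + 1, dim).
    """
    # Extend each mean with a constant 1 so the bias b is learned jointly.
    X = np.hstack([source_means, np.ones((source_means.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, target_means, rcond=None)
    return W

def apply_mllr(means, W):
    """Apply the estimated affine transform to a set of mean vectors."""
    X = np.hstack([means, np.ones((means.shape[0], 1))])
    return X @ W

# Toy check: recover a known affine shift of 3-dimensional means.
rng = np.random.default_rng(0)
mu = rng.normal(size=(10, 3))
A_true = np.diag([1.1, 0.9, 1.0])
b_true = np.array([0.5, -0.2, 0.1])
mu_adapted = mu @ A_true.T + b_true
W = estimate_mllr_transform(mu, mu_adapted)
print(np.allclose(apply_mllr(mu, W), mu_adapted))  # True
```

Because all Gaussians share one transform, a handful of adaptation vectors suffices to update every mean, which is what makes MLLR attractive when no extra data is available for a given accent.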

Index Terms: ASR, multilingual acoustic modeling, language label voting, PR-VSM, MLLR.


Bibliographic reference. Sam, Sethserey / Besacier, Laurent / Castelli, Eric / Ma, Bin / Leung, Cheung-Chi / Li, Haizhou (2010): "Autonomous acoustic model adaptation for multilingual meeting transcription involving high- and low-resourced languages", in SLTU-2010, 116-121.