INTERSPEECH 2011
This paper introduces the design of a multilingual spoken term detection (STD) system built on the CALLHOME and CALLFRIEND multilingual databases published by the Linguistic Data Consortium. Seven languages, namely Arabic, English, German, Japanese, Korean, Mandarin Chinese and Spanish, are used to train and evaluate the STD system.
As the core module of our language-general STD system, the multilingual automatic speech recogniser combines the acoustic and language models of the seven languages into a uniform model set. Much of our work focuses on comparing multilingual acoustic models: the conventional global phoneme set (GPS) based method and the recently proposed subspace GMM (SGMM) method [1] are investigated in detail. The experimental results demonstrate the viability of our multilingual STD system. The resulting multilingual system not only supports seven different languages but also gives satisfactory performance gains over the monolingual systems.
Bibliographic reference. Ma, Zejun / Wang, Xiaorui / Xu, Bo (2011): "An empirical study of multilingual spoken term detection", In INTERSPEECH-2011, 1921-1924.