ISCA Archive Interspeech 2005

Different size multilingual phone inventories and context-dependent acoustic models for language identification

Dong Zhu, Martine Adda-Decker, Fabien Antoine

This paper presents experimental work on phonotactic and syllabotactic approaches to automatic language identification (LID). The research is motivated by several questions: What is the best choice of multilingual phone inventory? Can a syllabic unit usefully extend the scope of the modeling unit? Can context-dependent (CD) acoustic models, widely used in speech recognition, improve LID accuracy? Can multilingual acoustic models efficiently process additional languages that differ from the training languages? The LID system is studied experimentally with multilingual phone sets of three sizes: 73, 50 and 35 phones. Experiments are carried out on broadcast news in seven languages (German, English, Arabic, Mandarin, Spanish, French, and Italian), with 140 hours of audio data for training and 7 hours for testing. The results show that smaller phone inventories achieve higher LID accuracy and that CD models outperform context-independent (CI) models. Further experiments test the generality of both the multilingual acoustic models and the phonotactic method on another corpus of 21 languages (11 known and 10 unknown).


doi: 10.21437/Interspeech.2005-717

Cite as: Zhu, D., Adda-Decker, M., Antoine, F. (2005) Different size multilingual phone inventories and context-dependent acoustic models for language identification. Proc. Interspeech 2005, 2833-2836, doi: 10.21437/Interspeech.2005-717

@inproceedings{zhu05d_interspeech,
  author={Dong Zhu and Martine Adda-Decker and Fabien Antoine},
  title={{Different size multilingual phone inventories and context-dependent acoustic models for language identification}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={2833--2836},
  doi={10.21437/Interspeech.2005-717}
}