11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Improved N-Gram Phonotactic Models for Language Recognition

Mohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori Lamel

LIMSI, France

This paper investigates various techniques to improve the estimation of n-gram phonotactic models for language recognition using single-best phone transcriptions and phone lattices. More precisely, we first report on the impact of the so-called acoustic scale factor on the system accuracy when using lattice-based training, and then we report on the use of n-gram cutoff and pruning techniques. Several system configurations are explored, such as the use of context-independent and context-dependent phone models, the use of single-best phone hypotheses versus phone lattices, and the use of various n-gram orders. Experiments are conducted using the LRE 2007 evaluation data and the results are reported using the a posteriori EER. The results show that the impact of these techniques on the system accuracy is highly dependent on the training conditions and that careful optimization can lead to performance improvements.
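
As a rough illustration of the two ideas named in the abstract, the sketch below computes expected n-gram counts from scaled acoustic scores and then applies a count cutoff. It is not the authors' implementation. It assumes a lattice can be stood in for by an explicit list of alternative phone paths with acoustic log-likelihoods, that the acoustic scale factor multiplies those log-likelihoods before normalization, and that a cutoff discards n-grams whose count does not exceed the threshold. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(phones, n):
    """Count n-grams in one phone sequence (the single-best case)."""
    return Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))


def path_posteriors(paths, acoustic_scale):
    """Turn acoustic log-likelihoods into path posteriors.

    Assumption: the scale multiplies each log-likelihood before the
    softmax. A scale of 0 makes all paths equally likely; a larger
    scale concentrates the mass on the best path.
    """
    scaled = [acoustic_scale * loglik for _, loglik in paths]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def expected_ngram_counts(paths, n, acoustic_scale):
    """Posterior-weighted n-gram counts over all alternative paths."""
    expected = Counter()
    for (phones, _), post in zip(paths, path_posteriors(paths, acoustic_scale)):
        for gram, count in ngram_counts(phones, n).items():
            expected[gram] += post * count
    return expected


def apply_cutoff(counts, cutoff):
    """Assumed convention: keep n-grams whose count is above the cutoff."""
    return {gram: c for gram, c in counts.items() if c > cutoff}


# Toy "lattice": two competing phone paths with made-up scores.
toy_paths = [(["a", "b", "a"], -10.0), (["a", "c", "a"], -12.0)]
bigrams = expected_ngram_counts(toy_paths, n=2, acoustic_scale=0.5)
kept = apply_cutoff(bigrams, cutoff=0.5)
```

With these toy scores, lowering `acoustic_scale` flattens the path posteriors, so more bigrams from the weaker path survive a given cutoff. This is one way to see why the two settings might interact.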

Bibliographic reference.  BenZeghiba, Mohamed Faouzi / Gauvain, Jean-Luc / Lamel, Lori (2010): "Improved n-gram phonotactic models for language recognition", In INTERSPEECH-2010, 2710-2713.