8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Fusion of Contrastive Acoustic Models for Parallel Phonotactic Spoken Language Identification

Khe Chai Sim, Haizhou Li

Institute for Infocomm Research, Singapore

This paper investigates the combination of contrastive acoustic models in parallel phonotactic language identification systems. Phone Recognition followed by Language Modelling (PRLM), a typical phonotactic approach, uses a phone recogniser to extract phonotactic information from the speech data. Combining multiple PRLM systems forms a Parallel PRLM (PPRLM) system. A standard PPRLM system obtains diversity by using multiple phone recognisers trained on different languages and phone sets. This paper proposes a new approach to PPRLM in which the parallel systems use phone recognisers with different acoustic models. The contrastive acoustic models are produced using the semi-tied covariance (STC) and subspace for precision and mean (SPAM) precision matrix modelling schemes, together with the maximum mutual information (MMI) training criterion. Preliminary experimental results are reported on the NIST language recognition evaluation sets. With only two training corpora, a 12-way PPRLM system using different acoustic modelling schemes achieved an equal error rate (EER) 2.0-5.0% absolute lower than that of the standard 2-way PPRLM system.
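
The PPRLM pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the language model order or the fusion rule, so the sketch assumes add-alpha smoothed bigram phone models and a simple average of per-system scores. The phone recognisers are replaced by hand-written phone sequences, and all names (train_bigram_lm, prlm_score, pprlm_identify, sysA, sysB, lang1, lang2) are hypothetical.

```python
import math
from collections import Counter


def train_bigram_lm(phone_seqs, vocab, alpha=1.0):
    """Return a log-probability function for an add-alpha smoothed bigram phone model."""
    bigrams, contexts = Counter(), Counter()
    for seq in phone_seqs:
        padded = ["<s>"] + list(seq)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    v = len(vocab)

    def logprob(prev, cur):
        return math.log((bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * v))

    return logprob


def prlm_score(phones, lm):
    """Average per-phone log-likelihood of one decoded phone sequence (one PRLM score)."""
    padded = ["<s>"] + list(phones)
    total = sum(lm(prev, cur) for prev, cur in zip(padded, padded[1:]))
    return total / max(len(phones), 1)


def pprlm_identify(decoded, lms):
    """Fuse parallel PRLM systems by averaging their scores (assumed fusion rule).

    decoded: {system name: phone sequence from that system's recogniser}
    lms:     {system name: {language: bigram log-probability function}}
    """
    languages = list(next(iter(lms.values())).keys())
    fused = {
        lang: sum(prlm_score(decoded[s], lms[s][lang]) for s in lms) / len(lms)
        for lang in languages
    }
    return max(fused, key=fused.get), fused


# Toy data: two parallel systems, each standing in for a recogniser with a
# different acoustic model, so each produces its own phone transcriptions.
vocab = ["a", "b", "c"]
train = {
    "sysA": {"lang1": [["a", "b", "a", "b"]], "lang2": [["c", "c", "a", "c"]]},
    "sysB": {"lang1": [["b", "a", "b", "a"]], "lang2": [["c", "a", "c", "c"]]},
}
lms = {
    system: {lang: train_bigram_lm(seqs, vocab) for lang, seqs in langs.items()}
    for system, langs in train.items()
}
best, scores = pprlm_identify(
    {"sysA": ["a", "b", "a", "b"], "sysB": ["b", "a", "b"]}, lms
)
print(best, scores)
```

In this sketch, moving from a 2-way to a 12-way system would only mean adding more entries to the two dictionaries, one per phone recogniser and acoustic model configuration.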

Bibliographic reference. Sim, Khe Chai / Li, Haizhou (2007): "Fusion of contrastive acoustic models for parallel phonotactic spoken language identification", In INTERSPEECH-2007, 170-173.