ISCA Archive Interspeech 2007

Fusion of contrastive acoustic models for parallel phonotactic spoken language identification

Khe Chai Sim, Haizhou Li

This paper investigates combining contrastive acoustic models for parallel phonotactic language identification systems. PRLM (phone recognition followed by language modelling), a typical phonotactic system, uses a phone recogniser to extract phonotactic information from speech data. Combining multiple PRLM systems forms a Parallel PRLM (PPRLM) system. A standard PPRLM system uses multiple phone recognisers trained on different languages and phone sets to provide diversity. This paper proposes a new approach to PPRLM in which the parallel systems use phone recognisers with different acoustic models. The semi-tied covariance (STC) and subspace for precision and mean (SPAM) precision matrix modelling schemes, as well as the maximum mutual information (MMI) training criterion, are used to produce contrastive acoustic models. Preliminary experimental results are reported on the NIST language recognition evaluation sets. With only two training corpora, a 12-way PPRLM system using different acoustic modelling schemes outperformed the standard 2-way PPRLM system by 2.0-5.0% absolute in equal error rate (EER).
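To make the PRLM and PPRLM structure described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each parallel system has already decoded the utterance into a phone sequence, models each target language with an add-one smoothed bigram over phones, and fuses the systems by averaging their per-language scores. The abstract does not specify the n-gram order, smoothing, or fusion backend, so those choices, along with all names and the toy data, are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sequences, vocab_size):
    """Return a log-probability function for an add-one smoothed phone bigram."""
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1

    def logprob(prev, cur):
        return math.log((pair_counts[(prev, cur)] + 1)
                        / (context_counts[prev] + vocab_size))

    return logprob


def prlm_scores(phones, language_models):
    """Score one decoded phone sequence against every language model (one PRLM)."""
    n_pairs = max(len(phones) - 1, 1)  # normalise by length so systems are comparable
    return {
        lang: sum(lm(a, b) for a, b in zip(phones, phones[1:])) / n_pairs
        for lang, lm in language_models.items()
    }


def pprlm_decide(decodings, systems):
    """Fuse parallel PRLM systems by averaging per-language scores (assumed fusion)."""
    fused = {}
    for name, language_models in systems.items():
        for lang, score in prlm_scores(decodings[name], language_models).items():
            fused[lang] = fused.get(lang, 0.0) + score / len(systems)
    return max(fused, key=fused.get), fused


# Hypothetical toy data: two front ends with different phone inventories,
# standing in for recognisers that differ in acoustic model or training language.
systems = {
    "recogniser_1": {
        "X": train_bigram([["a", "b", "a", "b"]], vocab_size=3),
        "Y": train_bigram([["a", "c", "a", "c"]], vocab_size=3),
    },
    "recogniser_2": {
        "X": train_bigram([["p", "q", "p", "q"]], vocab_size=3),
        "Y": train_bigram([["p", "r", "p", "r"]], vocab_size=3),
    },
}

# Phone sequences each recogniser produced for the same test utterance.
decodings = {
    "recogniser_1": ["a", "b", "a"],
    "recogniser_2": ["p", "q", "p"],
}

decision, fused = pprlm_decide(decodings, systems)
print(decision, fused)
```

In this sketch, adding a parallel system only means adding another entry to `systems` and `decodings`. That mirrors how the proposed approach grows from a 2-way to a 12-way system by varying the acoustic models rather than the training corpora.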


doi: 10.21437/Interspeech.2007-72

Cite as: Sim, K.C., Li, H. (2007) Fusion of contrastive acoustic models for parallel phonotactic spoken language identification. Proc. Interspeech 2007, 170-173, doi: 10.21437/Interspeech.2007-72

@inproceedings{sim07_interspeech,
  author={Khe Chai Sim and Haizhou Li},
  title={{Fusion of contrastive acoustic models for parallel phonotactic spoken language identification}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={170--173},
  doi={10.21437/Interspeech.2007-72}
}