Automatic language identification (LID) systems generally exploit acoustic knowledge, possibly enriched by explicit language-specific phonotactic or lexical constraints. This paper investigates a new LID approach based on hierarchical multilayer perceptron (MLP) classifiers, where the first layer is a "universal phoneme set MLP classifier". The resulting (multilingual) phoneme posterior sequence is fed into a second MLP that takes a larger temporal context into account. The second MLP can implicitly learn and exploit different types of information for LID, such as confusions between phonemes and phonotactic patterns. We investigate the viability of the proposed approach by comparing it against two standard approaches, which use phonotactic and lexical constraints, respectively, with the universal phoneme set MLP classifier as the emission probability estimator. On SpeechDat(II) datasets of five European languages, the proposed approach yields significantly better performance than the two standard approaches.
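The data flow of the hierarchy described above can be illustrated with a minimal sketch. It is not the authors' implementation: the weights are random and untrained, and the feature dimension, phoneme set size, hidden layer size, context width, edge padding, and the helper names `mlp_forward` and `stack_context` are all assumptions made for illustration. Only the number of languages (five) and the overall structure (phoneme posteriors from a first MLP, stacked over a temporal context, fed to a second MLP) come from the abstract.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mlp_forward(x, w_hidden, w_out):
    """One tanh hidden layer followed by a softmax output layer (biases omitted)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return softmax([sum(w * h for w, h in zip(row, hidden)) for row in w_out])


def stack_context(frames, context):
    """Concatenate each frame with `context` neighbours on each side.

    Edge frames are repeated as padding (an assumption, not from the paper).
    """
    last = len(frames) - 1
    stacked = []
    for t in range(len(frames)):
        window = []
        for offset in range(-context, context + 1):
            window.extend(frames[min(max(t + offset, 0), last)])
        stacked.append(window)
    return stacked


def random_weights(rng, n_out, n_in):
    """Untrained placeholder weights; a real system would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


rng = random.Random(0)

# Hypothetical sizes, except N_LANGS, which matches the five languages in the abstract.
N_FEATS, N_PHONEMES, N_HIDDEN, N_LANGS = 13, 40, 16, 5
CONTEXT = 4        # hypothetical: 4 frames on each side, 9 frames in total
N_FRAMES = 20

# Stand-in acoustic feature vectors for one utterance.
acoustic = [[rng.gauss(0.0, 1.0) for _ in range(N_FEATS)] for _ in range(N_FRAMES)]

# Stage 1: universal phoneme set classifier, one posterior vector per frame.
w1_hidden = random_weights(rng, N_HIDDEN, N_FEATS)
w1_out = random_weights(rng, N_PHONEMES, N_HIDDEN)
phoneme_posteriors = [mlp_forward(x, w1_hidden, w1_out) for x in acoustic]

# Stage 2: language classifier over a temporal window of phoneme posteriors.
stacked = stack_context(phoneme_posteriors, CONTEXT)
w2_hidden = random_weights(rng, N_HIDDEN, N_PHONEMES * (2 * CONTEXT + 1))
w2_out = random_weights(rng, N_LANGS, N_HIDDEN)
language_posteriors = [mlp_forward(x, w2_hidden, w2_out) for x in stacked]
```

Because the second stage sees several consecutive posterior vectors at once, it has access to both which phonemes are confused with each other and which phoneme sequences occur, without either being modeled explicitly. How frame-level language posteriors are combined into an utterance-level decision is not stated in the abstract and is left out of the sketch.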
Bibliographic reference. Imseng, David / Doss, Mathew Magimai / Bourlard, Hervé (2010): "Hierarchical multilayer perceptron based language identification", In INTERSPEECH-2010, 2722-2725.