This paper presents initial experiments in language identification for Spanish and Basque, both of which are official languages in the Basque Country in the north of Spain. We focus on three methods based on Hidden Markov Models (HMMs): parallel phone decoding, which uses no phonotactic knowledge; phone decoding scored by a phonotactic model; and single phone decoding scored by a phonotactic model, the latter two of which use phonotactic knowledge. Results for the three techniques are presented, along with others obtained using a neural network classifier. High accuracy is achieved when better phonotactic knowledge is used. The use of a neural network classifier results in a slight improvement and, overall, similar results are achieved for both languages, with accuracies around 98%.
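As an illustration of what scoring a decoded phone sequence with a phonotactic model can look like, the following minimal sketch trains one smoothed bigram phone model per language and picks the language whose model gives the highest log-likelihood. It is not the authors' implementation: the bigram order, the add-alpha smoothing, the function names (`train_bigram`, `score`, `identify`), the phone inventory and the toy training sequences are all assumptions made for the example, and the HMM phone decoder that would produce the phone sequence is omitted.

```python
import math
from collections import Counter

def train_bigram(sequences, phones, alpha=1.0):
    """Return a function giving add-alpha smoothed bigram log-probabilities."""
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the start of an utterance
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    vocab_size = len(phones)

    def logprob(prev, cur):
        num = pair_counts[(prev, cur)] + alpha
        den = context_counts[prev] + alpha * vocab_size
        return math.log(num / den)

    return logprob

def score(seq, logprob):
    """Log-likelihood of a phone sequence under one phonotactic model."""
    padded = ["<s>"] + list(seq)
    return sum(logprob(p, c) for p, c in zip(padded, padded[1:]))

def identify(seq, models):
    """Pick the language whose phonotactic model scores the sequence highest."""
    return max(models, key=lambda lang: score(seq, models[lang]))

# Hypothetical toy data: invented phone strings, not taken from the paper.
PHONES = ["a", "e", "k", "s", "t", "ts"]
models = {
    "spanish": train_bigram([["k", "a", "s", "a"], ["s", "e", "k", "a"]], PHONES),
    "basque": train_bigram([["e", "ts", "e", "a"], ["a", "ts", "e"]], PHONES),
}
decision = identify(["e", "ts", "e"], models)
```

In a real system the input to `identify` would be the output of a phone decoder, and the phonotactic models would be estimated from much larger transcribed corpora.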
Cite as: Guijarrubia, V.G., Torres, M.I. (2006) Basque-Spanish language identification using phone-based methods. Proc. Interspeech 2006, paper 1892-Mon2CaP.9, doi: 10.21437/Interspeech.2006-140
@inproceedings{guijarrubia06_interspeech,
  author={Victor G. Guijarrubia and M. Ines Torres},
  title={{Basque-Spanish language identification using phone-based methods}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1892-Mon2CaP.9},
  doi={10.21437/Interspeech.2006-140}
}