Odyssey 2008: The Speaker and Language Recognition Workshop
Stellenbosch, South Africa
Language identification (LID) of speech can be split into two processes: phone recognition and language modelling. This two-stage approach underlies some of the most successful LID systems. As phone recognizers become more accurate, it is useful to simulate a very accurate phone recognizer to determine its effect on overall LID accuracy. This can be done by using hand-annotated phone transcripts in place of recognizer output. In this paper, LID is performed on phone transcripts from six different languages in the OGI multi-language telephone speech corpus. When simulating a phone recognizer that classifies phones into ten broad classes, a simple n-gram model gives low LID equal error rates (EER) of <1% on 30 seconds of test data. Language models based on these accurate phone transcripts can also reveal insights into the phonology of different languages.
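The two-stage idea can be illustrated with a minimal sketch: phones are mapped to broad classes, one n-gram model is trained per language, and a test transcript is assigned to the language whose model gives it the highest likelihood. Everything specific here is an assumption for illustration rather than the paper's configuration: the phone-to-class map `BROAD_CLASS` (which does not reproduce the paper's ten classes), the choice of a bigram model with add-one smoothing, the toy training sequences, and the labels `lang_a` and `lang_b`.

```python
import math
from collections import Counter

# Hypothetical phone-to-broad-class map (illustrative only; not the paper's ten classes).
BROAD_CLASS = {
    "p": "stop", "t": "stop", "k": "stop", "b": "stop", "d": "stop", "g": "stop",
    "s": "fricative", "z": "fricative", "f": "fricative",
    "m": "nasal", "n": "nasal",
    "a": "vowel", "i": "vowel", "u": "vowel", "e": "vowel", "o": "vowel",
    "l": "liquid", "r": "liquid",
}
# Number of symbols a bigram can predict: the mapped classes plus "other".
VOCAB_SIZE = len(set(BROAD_CLASS.values())) + 1


def to_broad(phones):
    """Map a phone sequence to broad classes; unknown phones become 'other'."""
    return [BROAD_CLASS.get(p, "other") for p in phones]


def train_bigram(transcripts):
    """Count broad-class bigrams (with a start symbol) over training transcripts."""
    bigrams, contexts = Counter(), Counter()
    for phones in transcripts:
        classes = ["<s>"] + to_broad(phones)
        for prev, cur in zip(classes, classes[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts


def log_likelihood(model, phones):
    """Log-probability of a transcript under an add-one smoothed bigram model."""
    bigrams, contexts = model
    classes = ["<s>"] + to_broad(phones)
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + VOCAB_SIZE))
        for prev, cur in zip(classes, classes[1:])
    )


def identify(models, phones):
    """Return the language whose model scores the transcript highest."""
    return max(models, key=lambda lang: log_likelihood(models[lang], phones))


# Toy data (assumed): lang_a alternates consonant and vowel, lang_b has clusters.
models = {
    "lang_a": train_bigram([list("taki"), list("manu"), list("pote")]),
    "lang_b": train_bigram([list("stra"), list("split"), list("krast")]),
}
print(identify(models, list("bado")))
print(identify(models, list("stre")))
```

Using a shared `VOCAB_SIZE` for smoothing keeps the per-language scores comparable. A real system would be evaluated with detection scores and an EER threshold rather than a simple maximum, which this sketch omits.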
Bibliographic reference. Kempton, Timothy / Moore, Roger K. (2008): "Language identification: insights from the classification of hand annotated phone transcripts", In Odyssey-2008, paper 014.