This paper presents results on using rhythm for automatic language identification (LID). The idea is to explore the duration of pseudo-syllables as a language-discriminative feature. The resulting Rhythm system is based on bigram duration models of neighbouring pseudo-syllables. The Rhythm system is fused with a Spectral system realized by the Parallel Phoneme Recognition (PPR) approach using MFCCs. The LID systems were evaluated on a seven-language identification task using the SpeechDat II databases. Tests were performed on 7-second utterances. Whereas the Spectral system, acting as the baseline, achieved an error rate of 7.9%, the fused system reduced the error rate by 10% relative.
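To illustrate what a bigram duration model over neighbouring pseudo-syllables might look like, here is a minimal sketch. It is not the authors' implementation. The bin edges, the add-one smoothing, the toy training data, and all function names (`quantize`, `train_bigram`, `log_likelihood`, `identify`) are assumptions made for illustration. Pseudo-syllable segmentation and the fusion with the Spectral system are omitted.

```python
import math

# Hypothetical bin edges in seconds (an assumption, not taken from the paper).
BIN_EDGES = [0.10, 0.20, 0.30]


def quantize(duration):
    """Map a duration in seconds to a bin index in 0..len(BIN_EDGES)."""
    for i, edge in enumerate(BIN_EDGES):
        if duration < edge:
            return i
    return len(BIN_EDGES)


def train_bigram(sequences):
    """Estimate P(current bin | previous bin) with add-one smoothing."""
    n_bins = len(BIN_EDGES) + 1
    counts = [[1.0] * n_bins for _ in range(n_bins)]
    for seq in sequences:
        bins = [quantize(d) for d in seq]
        for prev, cur in zip(bins, bins[1:]):
            counts[prev][cur] += 1.0
    return [[c / sum(row) for c in row] for row in counts]


def log_likelihood(model, seq):
    """Sum the log bigram probabilities over neighbouring durations."""
    bins = [quantize(d) for d in seq]
    return sum(math.log(model[p][c]) for p, c in zip(bins, bins[1:]))


def identify(models, seq):
    """Return the language whose duration model scores the sequence highest."""
    return max(models, key=lambda lang: log_likelihood(models[lang], seq))


# Toy training data: made-up pseudo-syllable durations per language.
train = {
    "lang_a": [[0.05, 0.25, 0.05, 0.25, 0.05, 0.25]],  # alternating short/long
    "lang_b": [[0.15, 0.15, 0.15, 0.15, 0.15, 0.15]],  # evenly timed
}
models = {lang: train_bigram(seqs) for lang, seqs in train.items()}

print(identify(models, [0.06, 0.26, 0.04, 0.27]))
print(identify(models, [0.12, 0.18, 0.16]))
```

In this sketch each language gets its own transition table, and a test utterance is assigned to the language with the highest summed log probability. A real system would estimate the models from many utterances and could equally use continuous duration densities instead of discrete bins.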
Cite as: Timoshenko, E., Höge, H. (2007) Using speech rhythm for acoustic language identification. Proc. Interspeech 2007, 182-185, doi: 10.21437/Interspeech.2007-75
@inproceedings{timoshenko07_interspeech,
  author={Ekaterina Timoshenko and Harald Höge},
  title={{Using speech rhythm for acoustic language identification}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={182--185},
  doi={10.21437/Interspeech.2007-75}
}