Sixth European Conference on Speech Communication and Technology
Knowledge of the distribution of rare segments across the languages of the world might be used to identify languages within an open set. Segments that are both discriminatory (i.e. rare) and robust (i.e. easy to identify) are the best targets for efficient language identification. Considering several properties at the same time makes it possible to use more common segments and/or features while remaining highly discriminatory.
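The idea that rarity makes a segment discriminatory, and that combining several more common properties can recover that discriminatory power, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the language names (`LangA` to `LangD`), their inventories, and the helper functions `candidates` and `surprisal` are hypothetical and are not taken from the paper or from any real segment database. The sketch also works on a small closed set and does not model robustness (how easily a segment is detected in speech).

```python
from math import log2

# Hypothetical toy inventories: the language names and segment sets are
# invented for illustration only, not drawn from the paper or real data.
INVENTORIES = {
    "LangA": {"p", "t", "k", "m", "n", "click"},
    "LangB": {"p", "t", "k", "m", "n", "ejective"},
    "LangC": {"p", "t", "k", "m", "n", "ejective", "front_rounded_vowel"},
    "LangD": {"p", "t", "k", "m", "n", "front_rounded_vowel"},
}


def candidates(observed, inventories=INVENTORIES):
    """Return the languages whose inventory contains every observed segment."""
    return sorted(name for name, inv in inventories.items() if observed <= inv)


def surprisal(segment, inventories=INVENTORIES):
    """Information (in bits) gained by observing `segment`.

    Computed as -log2 of the share of languages that have the segment,
    so rarer segments score higher.
    """
    count = sum(segment in inv for inv in inventories.values())
    if count == 0:
        raise ValueError(f"segment {segment!r} occurs in no inventory")
    return -log2(count / len(inventories))


if __name__ == "__main__":
    # A segment shared by every language does not narrow the candidates.
    print(candidates({"p"}), surprisal("p"))
    # A rare segment identifies a language on its own.
    print(candidates({"click"}), surprisal("click"))
    # Two more common properties, taken together, are equally decisive.
    print(candidates({"ejective", "front_rounded_vowel"}))
```

In this toy data, each of the two more common properties alone leaves two candidate languages, while observing both together leaves one, which mirrors the abstract's point about considering several properties at the same time.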
Bibliographic reference. Hombert, Jean-Marie / Maddieson, Ian (1999): "The use of 'rare' segments for language identification", In EUROSPEECH'99, 379-382.