Odyssey 2008: The Speaker and Language Recognition Workshop
Stellenbosch, South Africa
Open set speaker identification consists of deciding whether an input utterance corresponds to one of a set of target speakers or to an impostor. The most likely target speaker is first hypothesized and then verified. Verification is performed by comparing the likelihood score of the most likely speaker model to the likelihood score of an impostor model, and then applying a suitable threshold. The most common approach to modelling impostors is the Universal Background Model (UBM). For the UBM to be effective, it must be estimated from a large number of speakers. However, it is not always possible to gather enough data to estimate a robust UBM, and verification performance may degrade if the impostors, or whatever sources generate the input signals, are not suitably modelled by the UBM. In this paper, a simple approach is proposed which estimates a shallow source model (SSM) from the input utterance itself, and then uses this SSM to normalize the speaker score. Though the SSM alone does not outperform the UBM, the combination of both models improves recognition performance and drastically increases robustness to signals not covered by the UBM.
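The hypothesize-and-verify procedure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: each model is a single diagonal-covariance Gaussian rather than a full mixture model, the data are synthetic, the "shallow" source model is simply the same kind of Gaussian fitted to the input utterance, and all function names and thresholds are hypothetical. The abstract does not specify how the UBM and SSM scores are combined, so the sketch leaves that step out.

```python
import numpy as np

def fit_diag_gaussian(frames, var_floor=1e-3):
    """Fit one diagonal-covariance Gaussian to frames of shape (T, D)."""
    mean = frames.mean(axis=0)
    var = np.maximum(frames.var(axis=0), var_floor)
    return mean, var

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood under a diagonal Gaussian."""
    mean, var = model
    ll = -0.5 * (np.log(2 * np.pi * var) + (frames - mean) ** 2 / var).sum(axis=1)
    return float(ll.mean())

def open_set_identify(frames, speaker_models, normalizer, threshold):
    """Hypothesize the most likely target speaker, then verify it.

    The best speaker score is normalized by the score of `normalizer`
    (a background model or a source model) and compared to `threshold`.
    Returns (speaker name or None if rejected, normalized score).
    """
    scores = {name: avg_log_likelihood(frames, m) for name, m in speaker_models.items()}
    best = max(scores, key=scores.get)
    normalized = scores[best] - avg_log_likelihood(frames, normalizer)
    return (best if normalized >= threshold else None), normalized

# Synthetic stand-ins for acoustic feature vectors (assumption: 2-D features).
rng = np.random.default_rng(0)
train_a = rng.normal(0.0, 1.0, size=(500, 2))
train_b = rng.normal(3.0, 1.0, size=(500, 2))
speaker_models = {"A": fit_diag_gaussian(train_a), "B": fit_diag_gaussian(train_b)}

# Background model pooled over the available speakers (a very small "UBM").
ubm = fit_diag_gaussian(np.vstack([train_a, train_b]))

target_utt = rng.normal(0.0, 1.0, size=(200, 2))     # spoken by target A
impostor_utt = rng.normal(10.0, 1.0, size=(200, 2))  # source unseen by the UBM

# Normalization by the background model (threshold chosen by hand).
ubm_target = open_set_identify(target_utt, speaker_models, ubm, threshold=0.0)
ubm_impostor = open_set_identify(impostor_utt, speaker_models, ubm, threshold=0.0)

# Normalization by a source model estimated from the input utterance itself.
ssm_target = open_set_identify(
    target_utt, speaker_models, fit_diag_gaussian(target_utt), threshold=-1.0)
ssm_impostor = open_set_identify(
    impostor_utt, speaker_models, fit_diag_gaussian(impostor_utt), threshold=-1.0)
```

In this toy setting the source model is the maximum-likelihood fit to the utterance, so the SSM-normalized score cannot exceed zero and the threshold must be negative. A well-matched target speaker scores close to zero, while a source far from every target model scores far below it, without requiring any background data.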
Bibliographic reference. Zamalloa, Maider / Rodriguez, Luis Javier / Penagarikano, Mikel / Bordel, Germán / Uribe, Juan Pedro (2008): "Improving robustness in open set speaker identification by shallow source modeling", In Odyssey-2008, paper 007.