ODYSSEY 2004 - The Speaker and Language Recognition Workshop, May 31 - June 3, 2004
We present an approach to speaker recognition in the text-independent domain of conversational telephone speech using a text-constrained system designed to employ select high-frequency keywords in the speech stream. The system uses speaker word models generated via Hidden Markov Models (HMMs), a departure from the traditional Gaussian Mixture Model (GMM) approach dominant in text-independent work but one commonly employed in text-dependent systems, with the expectation that HMMs take greater advantage of sequential information and support more detailed modeling, which could be used to aid recognition. Even with a keyword inventory that covers a mere 10% of the word tokens and a system that does not yet incorporate many standard speaker recognition normalization schemes, this approach is already achieving equal error rates of 1% on NIST's 2001 Extended Data task.
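For readers unfamiliar with the equal error rate (EER) quoted above, the following is a minimal sketch of how it can be approximated from a set of trial scores. It is not the authors' evaluation code: the function name `equal_error_rate` and the example scores are hypothetical, and the sketch simply sweeps a decision threshold over the observed scores and reports the point where the miss rate and the false-alarm rate are closest.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Hypothetical helper for illustration; higher scores are assumed to mean
    "more likely the claimed speaker".
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss rate: fraction of true-speaker trials rejected (score below t).
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: fraction of impostor trials accepted (score >= t).
        fa = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # Keep the threshold where the two error rates are closest.
        if abs(miss - fa) < best_gap:
            best_gap, eer = abs(miss - fa), (miss + fa) / 2
    return eer


# Made-up scores, purely for illustration.
targets = [2.0, 1.5, 0.2, 3.0]
impostors = [-1.0, 0.5, -0.3, 0.1]
print(equal_error_rate(targets, impostors))
```

With so few trials the estimate is coarse; on an evaluation with many trials the threshold sweep gives a much finer approximation of the crossing point.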
Bibliographic reference. Boakye, Kofi / Peskin, Barbara (2004): "Text-constrained speaker recognition on a text-independent task", In ODYS-2004, 129-134.