The 1999 NIST Speaker Recognition Evaluation encompassed three tasks: one-speaker detection, two-speaker detection, and speaker tracking. All tasks were performed in the context of conversational telephone speech. The one-speaker task used single-channel mu-law data; the other tasks used summed two-channel data. Twelve sites from the United States, Europe, and India participated in the evaluation. Performance was measured by a decision cost function and compared among systems and test conditions via DET curves. Performance factors examined include segment duration, degradation resulting from the presence of a second speaker, the sex mix of the two-speaker segments, whether the training and test handsets were matched or mismatched, and variation in handset type.
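The decision cost function mentioned above can be illustrated with a minimal sketch. It assumes the weighted-error form commonly used for detection tasks, C_det = C_miss * P_miss * P_target + C_fa * P_fa * (1 - P_target). The function names and the default parameter values (C_miss = 10, C_fa = 1, P_target = 0.01) are assumptions for illustration; the abstract does not state them.

```python
def error_rates(target_scores, nontarget_scores, threshold):
    """Return (P_miss, P_fa) for a given decision threshold.

    A trial is accepted when its score is >= threshold.
    Hypothetical helper, not taken from the paper.
    """
    misses = sum(s < threshold for s in target_scores)
    false_alarms = sum(s >= threshold for s in nontarget_scores)
    return misses / len(target_scores), false_alarms / len(nontarget_scores)


def detection_cost(p_miss, p_fa, c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Weighted sum of miss and false-alarm rates.

    The default costs and target prior are assumed values, not from the abstract.
    """
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


# Toy scores, invented purely for illustration.
targets = [0.9, 0.4, 0.8, 0.7]
nontargets = [0.1, 0.5, 0.2, 0.3, 0.6]

p_miss, p_fa = error_rates(targets, nontargets, threshold=0.45)
cost = detection_cost(p_miss, p_fa)
```

Sweeping the threshold over all observed scores and plotting the resulting (P_fa, P_miss) pairs on normal-deviate axes gives the kind of DET curve used to compare systems.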
Cite as: Przybocki, M.A., Martin, A.F. (1999) The 1999 NIST speaker recognition evaluation, using summed two-channel telephone data for speaker detection and speaker tracking. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 2215-2218, doi: 10.21437/Eurospeech.1999-491
@inproceedings{przybocki99_eurospeech,
  author={Mark A. Przybocki and Alvin F. Martin},
  title={{The 1999 NIST speaker recognition evaluation, using summed two-channel telephone data for speaker detection and speaker tracking}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={2215--2218},
  doi={10.21437/Eurospeech.1999-491}
}