2001: A Speaker Odyssey - The Speaker Recognition Workshop
June 18-22, 2001
Automatic speaker recognition technology appears to have reached a sufficient level of maturity for realistic application in the field of forensic science. However, key issues must be solved before the forensic community will accept its use as an investigative aid or as evidence in actual criminal cases. To assess the state of the technology, the Federal Bureau of Investigation (FBI) built a speech corpus with multiple levels of increasing difficulty based on text-independence, channel-independence, speaking mode, and speech duration. An evaluation of multiple automatic speaker recognition programs indicated that a recognition algorithm based on a large Gaussian mixture model (GMM), operating on features that are robust to channel variations, performed best. In this paper we describe (1) the results of evaluations of the recognition performance produced by multiple participating research organizations, (2) the FBI's initial Forensic Automatic Speaker Recognition (FASR) program based on these concepts, and (3) a confidence measurement method that indicates the probability that each recognition decision is correct. We also discuss the need and justification for input speech screening and pre-processing to improve the recognition performance of the FASR as applied in a real forensic environment.
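As a rough illustration of the GMM-based recognition mentioned in the abstract, the sketch below scores a set of feature frames by the average log-likelihood ratio between a speaker GMM and a background GMM. This is a generic textbook formulation, not the FASR implementation: the function names, the diagonal-covariance assumption, and all model parameters and feature values are hypothetical and chosen only for illustration.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def average_llr(frames, speaker_gmm, background_gmm):
    """Average per-frame log-likelihood ratio: speaker model vs. background model."""
    total = sum(
        gmm_log_likelihood(x, speaker_gmm) - gmm_log_likelihood(x, background_gmm)
        for x in frames
    )
    return total / len(frames)

# Toy two-dimensional models (made-up numbers, illustration only).
speaker_gmm = [(0.6, [1.0, 1.0], [0.5, 0.5]), (0.4, [2.0, 0.0], [0.5, 0.5])]
background_gmm = [(0.5, [0.0, 0.0], [2.0, 2.0]), (0.5, [1.0, -1.0], [2.0, 2.0])]

# Hypothetical feature frames near and far from the speaker model.
target_frames = [[1.1, 0.9], [1.9, 0.1], [1.0, 1.2]]
impostor_frames = [[-1.0, -1.0], [0.0, -2.0], [-0.5, -1.5]]

target_score = average_llr(target_frames, speaker_gmm, background_gmm)
impostor_score = average_llr(impostor_frames, speaker_gmm, background_gmm)
print(f"target score:   {target_score:.3f}")
print(f"impostor score: {impostor_score:.3f}")
```

A positive score favors the claimed speaker and a negative score favors the background population. A real system would use many more mixture components, channel-robust features extracted from speech, and a calibrated decision threshold.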
Bibliographic reference. Nakasone, Hirotaka / Beck, Steven D. (2001): "Forensic automatic speaker recognition", In ODYSSEY-2001, #.