2001: A Speaker Odyssey - The Speaker Recognition Workshop

June 18-22, 2001
Crete, Greece

Forensic Automatic Speaker Recognition

Hirotaka Nakasone (1), Steven D. Beck (2)

(1) Federal Bureau of Investigation, Quantico, VA, USA
(2) BAE Systems, Austin, TX, USA

Automatic speaker recognition technology appears to have reached a sufficient level of maturity for realistic application in forensic science. However, key issues must be solved before the forensic community will accept its use as an investigative aid or as evidence in actual criminal cases. To assess the state of the technology, the Federal Bureau of Investigation (FBI) built a speech corpus with multiple levels of increasing difficulty based on text-independence, channel-independence, speaking mode, and speech duration. An evaluation of multiple automatic speaker recognition programs indicated that a recognition algorithm based on a large Gaussian mixture model (GMM), operating on features that are robust to channel variations, had the best performance. In this paper we describe (1) the results of evaluations of the recognition performance produced by multiple participating research organizations, (2) the FBI's initial Forensic Automatic Speaker Recognition (FASR) program based on these concepts, and (3) a confidence measurement method that indicates the probabilistic certainty that each recognition decision is correct. We also discuss the need for input speech screening and pre-processing to improve the recognition performance of the FASR as applied in a real forensic environment.
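
For readers unfamiliar with GMM-based scoring, the following is a minimal illustrative sketch of how a speaker model can be compared against a background model with an average log-likelihood ratio. It is not the FASR implementation: the abstract does not specify features, model sizes, or the confidence measure, so the diagonal-covariance form, the background model, all function names, and the toy values below are assumptions made for illustration.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one frame under a GMM, computed with log-sum-exp."""
    terms = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def average_llr(frames, speaker_gmm, background_gmm):
    """Mean per-frame log-likelihood ratio: speaker model vs. background model."""
    total = 0.0
    for x in frames:
        total += gmm_log_likelihood(x, *speaker_gmm)
        total -= gmm_log_likelihood(x, *background_gmm)
    return total / len(frames)

# Toy 2-D models as (weights, means, variances); hypothetical values only.
speaker_gmm = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
background_gmm = ([1.0], [[5.0, 5.0]], [[4.0, 4.0]])

# Frames lying near the speaker model's components.
frames = [[0.1, -0.2], [0.9, 1.1], [0.4, 0.6]]
score = average_llr(frames, speaker_gmm, background_gmm)
print("average LLR:", score)
```

A positive score favors the speaker model over the background model. Turning such a score into a calibrated confidence level, as the paper proposes, would require a separate step that this sketch does not attempt.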



Bibliographic reference.  Nakasone, Hirotaka / Beck, Steven D. (2001): "Forensic automatic speaker recognition", In ODYSSEY-2001, #.