Odyssey 2008: The Speaker and Language Recognition Workshop

Stellenbosch, South Africa
January 21-24, 2008

How Vulnerable are Prosodic Features to Professional Imitators?

Mireia Farrús (1,2), Michael Wagner (1), Jan Anguita (1,2), Javier Hernando (2)

(1) National Centre for Biometric Studies, School of Information Sciences and Engineering, University of Canberra, Australia
(2) Research Centre, Department of Signal Theory and Communications, Technical University of Catalonia (UPC), Barcelona, Catalonia, Spain

Voice imitation is one of the potential threats to security systems that use automatic speaker recognition. Since prosodic features have been considered for state-of-the-art recognition systems in recent years, the question arises as to how vulnerable these features are to voice mimicking. In this study, two experiments are conducted on twelve individual prosodic features to determine how a prosodic speaker identification system would perform against professionally imitated voices. The results show that the identification error rate increases for most of the features; the exception is the range of the fundamental frequency, which seems to be relatively robust against voice mimicking. When all twelve features are fused, the identification error rate increases from 5% between the target voices and the imitators’ natural voices to 22% between the target voices and the imitators’ impersonations.

Bibliographic reference. Farrús, Mireia / Wagner, Michael / Anguita, Jan / Hernando, Javier (2008): "How vulnerable are prosodic features to professional imitators?", In Odyssey-2008, paper 002.