ISCA Archive Interspeech 2008

Robustness of prosodic features to voice imitation

Mireia Farrús, Michael Wagner, Jan Anguita, Javier Hernando

Prosody plays an important role in the human recognition process; therefore, prosodic elements are normally used by impersonators aiming to resemble someone else. Since such voice imitation is one of the potential threats to security systems relying on automatic speaker recognition, and prosodic features have been considered for state-of-the-art recognition systems in recent years, the question arises as to what extent a mimicker is able to get close to the prosodic characteristics of a target speaker. To this end, two experiments are conducted for twelve individual features in order to determine how a prosodic speaker identification system would perform against professionally imitated voices. The results show that the identification error rate increases for all the features except F0 range when the impersonators' modified voices are used instead of the impersonators' natural voices. Moreover, it seems easier to copy prosody on the basis of a whole sentence than for a specific word.

doi: 10.21437/Interspeech.2008-196

Cite as: Farrús, M., Wagner, M., Anguita, J., Hernando, J. (2008) Robustness of prosodic features to voice imitation. Proc. Interspeech 2008, 613-616, doi: 10.21437/Interspeech.2008-196

  author={Mireia Farrús and Michael Wagner and Jan Anguita and Javier Hernando},
  title={{Robustness of prosodic features to voice imitation}},
  booktitle={Proc. Interspeech 2008},