ISCA Archive ICSLP 2000

Can automatic speaker verification be improved by training the algorithms on emotional speech?

Klaus R. Scherer, Tom Johnstone, Gudrun Klasmeyer, Thomas Bänziger

The ongoing work described in this contribution attempts to demonstrate the need to train automatic speaker verification (ASV) algorithms on emotional speech, in addition to neutral speech, in order to achieve more robust results in real-life verification situations. A computerized induction program with 6 different tasks, producing different types of stressful or emotional speaker states, was developed, pretested, and used to record French-, German-, and English-speaking participants. For a subset of these speakers, physiological data were obtained to determine the degree of physiological arousal produced by the emotion inductions and to determine the correlation between physiological responses and voice production as revealed in acoustic parameters. In collaboration with a commercial ASV provider (Ensigma Ltd.), a standard verification procedure was applied to this speech material. This paper reports the first set of preliminary analyses for the subset of 30 German speakers. It is concluded that an evaluation of the promise of training ASV algorithms on emotional speech requires in-depth analyses of the individual differences in vocal reactivity and further exploration of the link between acoustic changes under stress or emotion and verification results.


doi: 10.21437/ICSLP.2000-392

Cite as: Scherer, K.R., Johnstone, T., Klasmeyer, G., Bänziger, T. (2000) Can automatic speaker verification be improved by training the algorithms on emotional speech? Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 2, 807-810, doi: 10.21437/ICSLP.2000-392

@inproceedings{scherer00b_icslp,
  author={Klaus R. Scherer and Tom Johnstone and Gudrun Klasmeyer and Thomas Bänziger},
  title={{Can automatic speaker verification be improved by training the algorithms on emotional speech?}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={2},
  pages={807--810},
  doi={10.21437/ICSLP.2000-392}
}