ISCA Archive SpeechProsody 2008

Voice stress extraction

Grazyna Demenko

The aim of the research was to assess the possibility of voice stress extraction and classification. It was assumed that the study’s results could be applied in call centers and could be useful for security services. The authentic Poznan police database of recordings of 997 emergency phone calls was used for analysis. Out of 60 000 recordings collected in the database, 20 000 were automatically selected, and a few hundred of these were eventually chosen for acoustic evaluation on the basis of a perceptual assessment. The MDVP analysis confirmed the statistical significance of such parameters as fundamental frequency, energy and pitch variations for stress categorization. Some segmental parameters, such as tremor and noise parameters, were also confirmed to be of some importance. Under highly stressful conditions, a systematic shift in pitch of over one octave was observed. It was concluded that the range of F0 per se does not seem to correlate with stress, whereas the shift in F0 register constitutes the primary indicator of stress. Linear Discriminant Analysis based on 12 acoustic features showed that it is possible to distinguish the following classes: neutral, depressive, stressed and highly stressed speech.
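
The distinction drawn above between F0 range and F0 register shift can be illustrated with a minimal sketch. It is not the paper's procedure: the function names are hypothetical, the median is assumed here as a simple stand-in for the register level, and the F0 values are made up for illustration rather than taken from the Poznan database. The only fact relied on is that one octave corresponds to a doubling of frequency.

```python
import math
from statistics import median


def register_shift_octaves(f0_reference_hz, f0_test_hz):
    """Shift of the F0 register in octaves, using the median F0 of each
    set of voiced frames as an assumed stand-in for the register level."""
    return math.log2(median(f0_test_hz) / median(f0_reference_hz))


def range_octaves(f0_hz):
    """F0 range within one set of frames (max over min), in octaves."""
    return math.log2(max(f0_hz) / min(f0_hz))


# Made-up illustrative F0 values in Hz (NOT data from the paper).
neutral = [110.0, 120.0, 125.0, 130.0, 140.0]
highly_stressed = [230.0, 250.0, 260.0, 270.0, 290.0]

shift = register_shift_octaves(neutral, highly_stressed)
print(f"register shift: {shift:.2f} octaves")
print(f"range (neutral): {range_octaves(neutral):.2f} octaves")
print(f"range (highly stressed): {range_octaves(highly_stressed):.2f} octaves")
```

With these invented values the two utterances have nearly the same F0 range, while the register moves up by slightly more than one octave, which is the kind of pattern the abstract describes for highly stressed speech.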


Cite as: Demenko, G. (2008) Voice stress extraction. Proc. Speech Prosody 2008, 53-56

@inproceedings{demenko08_speechprosody,
  author={Grazyna Demenko},
  title={{Voice stress extraction}},
  year=2008,
  booktitle={Proc. Speech Prosody 2008},
  pages={53--56}
}