ISCA Archive Interspeech 2005

Speech event detection using multiband modulation energy

Georgios Evangelopoulos, Petros Maragos

The need for efficient, sophisticated features for speech event detection is inherent in state-of-the-art processing, enhancement and recognition systems. We explore ideas and techniques from non-linear speech modeling and analysis, such as modulations and multiband filtering, and propose new energy and spectral content features. These are derived by filtering in multiple frequency bands and tracking the dominant modulation energy, measured as the Teager-Kaiser energy of separate AM-FM components. We present a detection-theoretic motivation for the features and incorporate them in two detection schemes, namely word boundary detection and voice activity detection. The modulation approach improved noisy speech endpoint detection accuracy, reaching ~40% error reduction on NTIMIT. In a voice activity scheme, the improvement in overall misclassification error of a high hit-rate detector reached 7.5% on the Aurora 2 database and 9.5% on the Aurora 3 database.
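As a rough illustration of the feature described in the abstract, the sketch below filters a signal in several bands, applies the discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1]x[n+1], to each band output, and keeps the largest frame-averaged energy across bands. It is a minimal sketch and not the authors' implementation: the Gabor-shaped filters, the bandwidth parameterisation, the band centers, the frame length, and all function names are assumptions made for this example.

```python
import math


def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    psi = [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
    # Repeat the edge values so the output has the same length as the input.
    return [psi[0]] + psi + [psi[-1]]


def gabor_filter(center_hz, bandwidth_hz, fs, half_len=50):
    """Real Gabor impulse response: a Gaussian-windowed cosine (assumed filter shape)."""
    alpha = bandwidth_hz * math.sqrt(2 * math.pi) / fs  # assumed bandwidth convention
    omega = 2 * math.pi * center_hz / fs
    return [math.exp(-(alpha * n) ** 2) * math.cos(omega * n)
            for n in range(-half_len, half_len + 1)]


def convolve_same(x, h):
    """Zero-padded convolution whose output is aligned with and as long as x."""
    half = len(h) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            i = n + half - k
            if 0 <= i < len(x):
                acc += hk * x[i]
        out.append(acc)
    return out


def max_band_energy(x, fs, centers_hz, bandwidth_hz=400.0, frame_len=200):
    """Per frame, return (largest mean Teager-Kaiser energy, index of that band)."""
    band_energies = [
        teager_kaiser(convolve_same(x, gabor_filter(fc, bandwidth_hz, fs)))
        for fc in centers_hz
    ]
    frames = []
    for start in range(0, len(x) - frame_len + 1, frame_len):
        means = [sum(e[start:start + frame_len]) / frame_len for e in band_energies]
        best = max(range(len(means)), key=lambda k: means[k])
        frames.append((means[best], best))
    return frames


# Toy input: 400 samples of silence followed by a 1 kHz tone, sampled at 8 kHz.
fs = 8000
signal = [0.0] * 400 + [math.cos(2 * math.pi * 1000 * n / fs) for n in range(400)]
frames = max_band_energy(signal, fs, centers_hz=[500, 1000, 2000, 3000])
```

On this toy input the leading silent frame has zero energy, while the final frame selects the band centered at 1 kHz. A detector could then threshold the per-frame energy, although the thresholding and decision logic used in the paper are not described in the abstract and are not reproduced here.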

doi: 10.21437/Interspeech.2005-197

Cite as: Evangelopoulos, G., Maragos, P. (2005) Speech event detection using multiband modulation energy. Proc. Interspeech 2005, 685-688, doi: 10.21437/Interspeech.2005-197

  author={Georgios Evangelopoulos and Petros Maragos},
  title={{Speech event detection using multiband modulation energy}},
  booktitle={Proc. Interspeech 2005},