Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Speech Event Detection Using Multiband Modulation Energy

Georgios Evangelopoulos, Petros Maragos

National Technical University of Athens, Greece

The need for efficient, sophisticated features for speech event detection is inherent in state-of-the-art processing, enhancement, and recognition systems. We explore ideas and techniques from nonlinear speech modeling and analysis, such as modulations and multiband filtering, and propose new energy and spectral content features. These are derived by filtering in multiple frequency bands and tracking the dominant modulation energy, measured as the Teager-Kaiser energy of separate AM-FM components. We present a detection-theoretic motivation for the features and incorporate them in two detection schemes, namely word boundary detection and voice activity detection. The modulation approach improved noisy speech endpoint detection accuracy, reaching ~40% error reduction on NTIMIT. In a voice activity scheme, the improvement in overall misclassification error of a high hit-rate detector reached 7.5% on the Aurora 2 and 9.5% on the Aurora 3 databases.
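
As a rough illustration of the energy feature described in the abstract, the sketch below applies the standard discrete Teager-Kaiser operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], to each band of a frame and keeps the largest mean energy. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `teager_kaiser` and `max_average_band_energy` are hypothetical, the band signals are assumed to be already band-pass filtered (no filterbank is implemented here), and two synthetic tones stand in for filter outputs.

```python
import math


def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The two edge samples are dropped, so the result has len(x) - 2 values.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def max_average_band_energy(band_signals):
    """Largest mean Teager-Kaiser energy among the band signals of one frame.

    Assumption: each entry of band_signals is the same frame after
    band-pass filtering in a different frequency band.
    """
    means = []
    for band in band_signals:
        psi = teager_kaiser(band)
        means.append(sum(psi) / len(psi))
    return max(means)


# Synthetic stand-ins for two band outputs (hypothetical test signals).
low_band = [1.0 * math.cos(0.1 * n) for n in range(50)]
high_band = [2.0 * math.cos(0.3 * n) for n in range(50)]

frame_energy = max_average_band_energy([low_band, high_band])
```

For a pure tone A*cos(W*n), the operator returns the constant A^2 * sin(W)^2, so `frame_energy` here should match that value for the second tone, which dominates because both its amplitude and its frequency are higher.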

Bibliographic reference.  Evangelopoulos, Georgios / Maragos, Petros (2005): "Speech event detection using multiband modulation energy", In INTERSPEECH-2005, 685-688.