Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Voicing Features for Robust Speech Detection

Trausti Kristjansson, Sabine Deligne, Peder Olsen

IBM T.J. Watson Research Center, Yorktown Heights, NY, USA

Accurate speech activity detection is a challenging problem in the car environment, where high background noise and high-amplitude transient sounds are common. We investigate a number of features designed to capture the harmonic structure of speech. We evaluate three important characteristics of these features separately: 1) discriminative power, 2) robustness to greatly varying SNR and channel characteristics, and 3) performance when used in conjunction with MFCC features. We propose a new feature, the Windowed Autocorrelation Lag Energy (WALE), which has desirable properties.
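
As a rough illustration of what a feature capturing harmonic structure can look like, the sketch below computes a generic voicing measure: the peak of the normalized autocorrelation of one frame within a plausible pitch-lag range. It is not the WALE feature, whose definition is not given in this abstract. The function name `autocorr_voicing`, the 60-400 Hz pitch range, and the 16 kHz, 512-sample frame used in the usage lines are all assumptions made for the example.

```python
import numpy as np


def autocorr_voicing(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Peak normalized autocorrelation within an assumed pitch-lag range.

    Generic voicing measure for illustration only; NOT the WALE feature.
    Values near 1 suggest a strongly periodic (voiced) frame, values near 0
    suggest an aperiodic frame. The default pitch range is an assumption.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()          # remove DC offset
    energy = np.dot(frame, frame)         # autocorrelation at lag 0
    if energy == 0.0:
        return 0.0                        # silent frame: no periodicity
    # Full autocorrelation; keep lags 0 .. len(frame) - 1.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    acf = acf / energy                    # normalize so that acf[0] == 1
    lag_min = int(sample_rate / f0_max)   # shortest pitch period in samples
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    return float(acf[lag_min:lag_max + 1].max())


# Usage: a synthetic harmonic frame versus white noise (hypothetical data).
sample_rate = 16000
t = np.arange(512) / sample_rate
harmonic = (np.sin(2 * np.pi * 200 * t)
            + 0.5 * np.sin(2 * np.pi * 400 * t)
            + 0.25 * np.sin(2 * np.pi * 600 * t))
noise = np.random.default_rng(0).standard_normal(512)

voiced_score = autocorr_voicing(harmonic, sample_rate)
noise_score = autocorr_voicing(noise, sample_rate)
```

In this sketch the harmonic frame scores much higher than the noise frame. A measure of this kind is insensitive to overall signal level because of the normalization by frame energy, which is one reason periodicity-based features are attractive when SNR and channel characteristics vary.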

Bibliographic reference. Kristjansson, Trausti / Deligne, Sabine / Olsen, Peder (2005): "Voicing features for robust speech detection", in INTERSPEECH-2005, 369-372.