Interspeech'2005 - Eurospeech
Accurate speech activity detection is a challenging problem in the car environment, where high background noise and high-amplitude transient sounds are common. We investigate a number of features that are designed to capture the harmonic structure of speech. We evaluate separately three important characteristics of these features: 1) discriminative power, 2) robustness to greatly varying SNR and channel characteristics, and 3) performance when used in conjunction with MFCC features. We propose a new feature, the Windowed Autocorrelation Lag Energy (WALE), which has desirable properties.
Bibliographic reference. Kristjansson, Trausti / Deligne, Sabine / Olsen, Peder (2005): "Voicing features for robust speech detection", In INTERSPEECH-2005, 369-372.