This study examines the difficult task of Speech Activity Detection (SAD) in two hostile environments: AM push-to-talk air traffic control and international telephone conversations with very low SNRs. Because traditional energy-based SAD performs poorly in these conditions, two novel approaches were developed that target the spectral characteristics that typify speech, rather than trying to separate out the background, which can vary enormously. As a result, these approaches are inherently adaptive to their environments. This paper describes a Speech Energy Resonance Band Detection approach and a Harmonic Product Spectrum clustering approach to SAD, and evaluates their performance against MIT Xtalk and the Teager Energy Operator (TEO) in clean and hostile environments.
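For readers unfamiliar with the TEO baseline named above, the following is a minimal sketch of the standard discrete Teager Energy Operator, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The abstract does not say how the TEO was configured or thresholded for SAD in this study, so the function names and the frame-level averaging below are illustrative assumptions, not the authors' implementation.

```python
import math

def teager_energy(x):
    # Standard discrete Teager Energy Operator:
    # psi[n] = x[n]^2 - x[n-1] * x[n+1], defined for interior samples only.
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def mean_teager_energy(frame):
    # Hypothetical frame-level feature (an assumption for illustration):
    # the average TEO value over one frame of samples.
    psi = teager_energy(frame)
    return sum(psi) / len(psi) if psi else 0.0

# For a pure tone A*cos(omega*n + phi), the operator is constant
# at A^2 * sin(omega)^2, so it tracks both amplitude and frequency.
tone = [2.0 * math.cos(0.3 * n + 0.5) for n in range(50)]
tone_energy = mean_teager_energy(tone)
```

A detector built on this feature would compare each frame's value against a threshold; how that threshold is chosen is not stated in the abstract and is left open here.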
Bibliographic reference. Huggins, Mark / Smolenski, Brett / Lawson, Aaron (2010): "Adaptive high accuracy approaches to speech activity detection in noisy and hostile audio environments", In INTERSPEECH-2010, 3094-3097.