ISCA Archive Interspeech 2008

Detection of speech embedded in real acoustic background based on amplitude modulation spectrogram features

Jörn Anemüller, Denny Schmidt, Jörg-Hendrik Bach

A classification method is presented that detects the presence of speech embedded in a real acoustic background of non-speech sounds. Features used for classification are modulation components extracted by computation of the amplitude modulation spectrogram. Feature selection techniques and support vector classification are employed to identify the modulation components most salient for the classification task, which are therefore considered highly characteristic of speech. Results show that reliable detection of speech can be performed with fewer than 10 optimally selected modulation features; the most important ones are located in the modulation frequency range below 10 Hz. Detection of speech in a background of non-speech signals is performed with about 90% test-data accuracy at a signal-to-noise ratio of 0 dB. Compared to standard ITU G729.B voice activity detection, the proposed method yields a higher true positive rate and a lower rate of false positives induced by a real acoustic background.
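As a rough illustration of the kind of feature the abstract refers to, the sketch below computes a generic amplitude modulation spectrogram in two stages: a short-time spectrum yields one magnitude envelope per acoustic frequency band, and a second spectral analysis of each envelope over time yields its modulation components. It is not the authors' implementation. The function names, the frame length, hop size, Hann window, and the synthetic test signal are all assumptions made for this example, and the feature selection and support vector classification steps are not shown.

```python
import cmath
import math


def dft_magnitudes(seq):
    """Magnitudes of the non-negative-frequency DFT bins of a real sequence."""
    n = len(seq)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(seq[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def amplitude_modulation_spectrogram(signal, frame_len=32, hop=16):
    """Return ams[band][mod_bin]: modulation magnitudes per acoustic band.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len)
              for t in range(frame_len)]
    # Stage 1: short-time spectrum, one magnitude value per band and frame.
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(frame_len)]
        frames.append(dft_magnitudes(frame))
    ams = []
    for band in range(frame_len // 2 + 1):
        envelope = [f[band] for f in frames]
        mean = sum(envelope) / len(envelope)
        # Stage 2: spectrum of the mean-removed envelope across frames.
        ams.append(dft_magnitudes([e - mean for e in envelope]))
    return ams


# Synthetic test signal (assumption): 125 Hz carrier, amplitude-modulated at 4 Hz.
FS = 1000   # sampling rate in Hz
HOP = 16
FRAME_LEN = 32
signal = [(1 + 0.8 * math.cos(2 * math.pi * 4 * n / FS))
          * math.sin(2 * math.pi * 125 * n / FS) for n in range(2000)]

ams = amplitude_modulation_spectrogram(signal, FRAME_LEN, HOP)

n_frames = 1 + (len(signal) - FRAME_LEN) // HOP
frame_rate = FS / HOP                      # envelope sampling rate in Hz
carrier_band = round(125 * FRAME_LEN / FS)  # acoustic band containing 125 Hz
peak_bin = max(range(len(ams[carrier_band])),
               key=lambda k: ams[carrier_band][k])
peak_mod_freq = peak_bin * frame_rate / n_frames
print(f"dominant modulation frequency: {peak_mod_freq:.2f} Hz")
```

In this toy case the dominant modulation component of the carrier band lies close to the 4 Hz modulation rate, which falls in the sub-10 Hz modulation range that the abstract reports as most informative for speech. A classifier would take selected entries of `ams`, computed over successive signal segments, as its input features.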


doi: 10.21437/Interspeech.2008-640

Cite as: Anemüller, J., Schmidt, D., Bach, J.-H. (2008) Detection of speech embedded in real acoustic background based on amplitude modulation spectrogram features. Proc. Interspeech 2008, 2582-2585, doi: 10.21437/Interspeech.2008-640

@inproceedings{anemuller08_interspeech,
  author={Jörn Anemüller and Denny Schmidt and Jörg-Hendrik Bach},
  title={{Detection of speech embedded in real acoustic background based on amplitude modulation spectrogram features}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={2582--2585},
  doi={10.21437/Interspeech.2008-640}
}