ISCA Archive Interspeech 2013

Multi-band long-term signal variability features for robust voice activity detection

Andreas Tsiartas, Theodora Chaspari, Nassos Katsamanis, Prasanta Kumar Ghosh, Ming Li, Maarten Van Segbroeck, Alexandros Potamianos, Shrikanth Narayanan

In this paper, we propose robust features for the problem of voice activity detection (VAD). In particular, we extend the long-term signal variability (LTSV) feature to accommodate multiple spectral bands. The motivation for the multi-band approach stems from the non-uniform frequency scale of speech phonemes and noise characteristics. Our analysis shows that the multi-band approach offers advantages over the single-band LTSV for voice activity detection. In terms of classification accuracy, we show 0.3%-61.2% relative improvement over the best accuracy of the baselines considered for 7 out of 8 different noisy channels. Experimental results and error analysis are reported on the DARPA RATS corpora of noisy speech.
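
As a rough illustration of the idea in the abstract, the sketch below computes an LTSV-style measure separately in several spectral bands. It is not the authors' implementation. It assumes the usual single-band formulation, in which each frequency bin gets an entropy computed over the last R frames of its normalized power spectrum, and LTSV is the variance of those entropies across bins. The function names, the window length `R`, and the `band_edges` layout are hypothetical; the abstract does not specify the paper's band design.

```python
import numpy as np


def ltsv(power_spec, R):
    """LTSV-style measure for a (frames, bins) non-negative power spectrogram.

    Returns one value per frame m >= R - 1, using frames m-R+1 .. m.
    """
    n_frames, _ = power_spec.shape
    eps = 1e-12  # avoids log(0) and division by zero (an assumption of this sketch)
    out = np.empty(n_frames - R + 1)
    for m in range(R - 1, n_frames):
        window = power_spec[m - R + 1 : m + 1] + eps      # shape (R, bins)
        p = window / window.sum(axis=0, keepdims=True)    # normalize over time per bin
        entropy = -(p * np.log(p)).sum(axis=0)            # one entropy per bin
        out[m - R + 1] = entropy.var()                    # variance across bins
    return out


def multiband_ltsv(power_spec, R, band_edges):
    """Stack the per-band measure; band_edges is a list of (lo, hi) bin ranges."""
    return np.stack(
        [ltsv(power_spec[:, lo:hi], R) for lo, hi in band_edges], axis=1
    )


# Hypothetical usage: 20 frames, 8 bins, two equal-width bands.
spec = np.ones((20, 8))
spec[::2, 0] = 100.0  # make bin 0 fluctuate over time
features = multiband_ltsv(spec, R=5, band_edges=[(0, 4), (4, 8)])
```

In this toy input only the first band contains a fluctuating bin, so only that band's feature is non-zero; a stationary band yields identical entropies in every bin and hence zero variance.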


doi: 10.21437/Interspeech.2013-201

Cite as: Tsiartas, A., Chaspari, T., Katsamanis, N., Ghosh, P.K., Li, M., Segbroeck, M.V., Potamianos, A., Narayanan, S. (2013) Multi-band long-term signal variability features for robust voice activity detection. Proc. Interspeech 2013, 718-722, doi: 10.21437/Interspeech.2013-201

@inproceedings{tsiartas13_interspeech,
  author={Andreas Tsiartas and Theodora Chaspari and Nassos Katsamanis and Prasanta Kumar Ghosh and Ming Li and Maarten Van Segbroeck and Alexandros Potamianos and Shrikanth Narayanan},
  title={{Multi-band long-term signal variability features for robust voice activity detection}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={718--722},
  doi={10.21437/Interspeech.2013-201}
}