ISCA Archive Interspeech 2008

Evaluation of voice activity and voicing detection

Bojan Kotnik, Pierre Sendorek, Sergey Astrov, Turgay Koc, Tolga Ciloglu, Laura Docío Fernández, Eduardo Rodríguez Banga, Harald Höge, Zdravko Kačič

This paper describes the ECESS evaluation campaign for voice activity and voicing detection. Standard voice activity detection (VAD) classifies a signal into speech and non-speech; we extend it to VAD+, which classifies a signal as a sequence of non-speech, voiced, and unvoiced segments. The evaluation is performed on a portion of the Spanish SPEECON database with manually labeled segmentation. To avoid errors caused by the limited precision of manual labeling, we introduce "dead zones": tolerance intervals of ±5 ms around label changes in the data set. The signal is not evaluated within these tolerance intervals.
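The dead-zone idea can be illustrated with a minimal sketch. It is not the campaign's actual scoring tool: the 1 ms label resolution, the label names `N`/`V`/`U`, the function names, and the choice to treat the ±5 ms interval as inclusive at both ends are all assumptions made for illustration.

```python
def label_changes(ref):
    """Indices (in ms) where the reference label differs from the previous one."""
    return [i for i in range(1, len(ref)) if ref[i] != ref[i - 1]]


def score(ref, hyp, boundaries, tol_ms=5):
    """Count label errors outside the dead zones.

    ref, hyp   : per-millisecond label sequences (assumed resolution)
    boundaries : label-change times in ms taken from the reference
    tol_ms     : half-width of each dead zone; |t - b| <= tol_ms is skipped
                 (inclusive bounds are an assumption of this sketch)
    Returns (errors, scored_samples).
    """
    errors = 0
    scored = 0
    for t, (r, h) in enumerate(zip(ref, hyp)):
        if any(abs(t - b) <= tol_ms for b in boundaries):
            continue  # inside a dead zone: not evaluated
        scored += 1
        if r != h:
            errors += 1
    return errors, scored


# Hypothetical 60 ms example: non-speech, voiced, unvoiced.
ref = ["N"] * 20 + ["V"] * 20 + ["U"] * 20
# Detector output: first change 3 ms late, second change 8 ms late.
hyp = ["N"] * 23 + ["V"] * 25 + ["U"] * 12

with_dead_zones = score(ref, hyp, label_changes(ref))
without_dead_zones = score(ref, hyp, [])
```

In this example the 3 ms offset falls entirely inside a dead zone and is not penalized, while only the part of the 8 ms offset lying beyond the tolerance interval counts as an error.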

doi: 10.21437/Interspeech.2008-456

Cite as: Kotnik, B., Sendorek, P., Astrov, S., Koc, T., Ciloglu, T., Fernández, L.D., Banga, E.R., Höge, H., Kačič, Z. (2008) Evaluation of voice activity and voicing detection. Proc. Interspeech 2008, 1642-1645, doi: 10.21437/Interspeech.2008-456

  author={Bojan Kotnik and Pierre Sendorek and Sergey Astrov and Turgay Koc and Tolga Ciloglu and Laura Docío Fernández and Eduardo Rodríguez Banga and Harald Höge and Zdravko Kačič},
  title={{Evaluation of voice activity and voicing detection}},
  booktitle={Proc. Interspeech 2008},