This paper describes the ECESS evaluation campaign on voice activity and voicing detection. Standard voice activity detection (VAD) classifies a signal into speech and non-speech; we extend it to VAD+, which classifies a signal as a sequence of non-speech, voiced, and unvoiced segments. The evaluation is performed on a portion of the Spanish SPEECON database with manually labeled segmentation. To avoid errors caused by the limited precision of manual labeling, we introduce "dead zones": tolerance intervals of ±5 ms around label changes in the data set. The signal is not evaluated within these tolerance intervals.
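The dead-zone idea can be illustrated with a minimal sketch. It is not the campaign's scoring tool: the 1 ms frame resolution, the label symbols "N", "V" and "U" (non-speech, voiced, unvoiced), and the function names `dead_zone_mask` and `score` are assumptions made for this example only. Frames that lie within the tolerance around a change in the reference labels are excluded before errors are counted.

```python
from typing import List, Tuple


def dead_zone_mask(ref: List[str], tol: int) -> List[bool]:
    """Mark frames within `tol` frames of a label change in the reference.

    Assumes one label per 1 ms frame, so tol=5 corresponds to +-5 ms.
    """
    n = len(ref)
    mask = [False] * n
    for i in range(1, n):
        if ref[i] != ref[i - 1]:
            # The change sits on the boundary between frame i-1 and frame i,
            # so frames i-tol .. i+tol-1 cover the +-tol interval around it.
            for j in range(max(0, i - tol), min(n, i + tol)):
                mask[j] = True
    return mask


def score(ref: List[str], hyp: List[str], tol: int = 5) -> Tuple[int, int]:
    """Return (misclassified frames, scored frames) outside the dead zones."""
    if len(ref) != len(hyp):
        raise ValueError("reference and hypothesis must have the same length")
    mask = dead_zone_mask(ref, tol)
    scored = [(r, h) for r, h, m in zip(ref, hyp, mask) if not m]
    errors = sum(1 for r, h in scored if r != h)
    return errors, len(scored)


# Hypothetical 60 ms example: the detector is 3 ms late at the first
# boundary and falls back to non-speech 8 ms before the end.
ref = ["N"] * 20 + ["V"] * 20 + ["U"] * 20
hyp = ["N"] * 23 + ["V"] * 17 + ["U"] * 12 + ["N"] * 8

print(score(ref, hyp))         # (8, 40)
print(score(ref, hyp, tol=0))  # (11, 60)
```

In this toy case the 3 ms boundary offset falls inside a dead zone and is not penalized, while the 8 ms of misclassified frames far from any label change still count as errors.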
Bibliographic reference. Kotnik, Bojan / Sendorek, Pierre / Astrov, Sergey / Koc, Turgay / Ciloglu, Tolga / Fernández, Laura Docío / Banga, Eduardo Rodríguez / Höge, Harald / Kačič, Zdravko (2008): "Evaluation of voice activity and voicing detection", In INTERSPEECH-2008, 1642-1645.