We describe a voice activity detection algorithm that significantly improves a narrow-bandwidth speaker verification system in harsh environments. The algorithm is based on a time-scale feature extracted from wavelet subbands. A statistical quantile filtering technique is proposed to estimate an adaptive noise threshold, and a hang-over scheme is then applied to bridge short pauses between speech frames. The optimized voice activity detector is embedded in the front-end unit of the narrow-bandwidth speaker verification system. The proposed algorithm is evaluated in objective tests on band-pass filtered utterances from the TIMIT database that were artificially corrupted by different types of additive noise. It is also tested on the band-pass filtered SPEECHDAT-AT and WSJ0 databases in terms of speaker verification rates. The algorithm's superior performance is attributed to the robust time-scale feature and the adaptive threshold.
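The thresholding and hang-over stages can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the wavelet-based time-scale feature, so the input here is assumed to be any non-negative per-frame feature in which speech frames score higher than noise. The function names and the parameters `window`, `q`, `scale`, and `hang` are hypothetical choices made for illustration, and the nearest-rank quantile is a simplification.

```python
def running_quantile_threshold(feature, window=50, q=0.1, scale=2.0):
    """Adaptive noise threshold from a running low quantile (assumed form).

    For each frame, the q-quantile of the last `window` feature values is
    taken as the noise level and multiplied by `scale`.
    """
    thresholds = []
    for i in range(len(feature)):
        recent = sorted(feature[max(0, i - window + 1): i + 1])
        # Nearest-rank quantile: a simplification chosen for this sketch.
        noise_level = recent[int(q * (len(recent) - 1))]
        thresholds.append(scale * noise_level)
    return thresholds


def apply_hangover(raw_decisions, hang=3):
    """Hold the speech decision for `hang` frames after the last speech frame,
    which bridges pauses of up to `hang` frames."""
    smoothed = []
    counter = 0
    for is_speech in raw_decisions:
        if is_speech:
            counter = hang          # restart the hang-over countdown
            smoothed.append(True)
        elif counter > 0:
            counter -= 1            # still inside the hang-over period
            smoothed.append(True)
        else:
            smoothed.append(False)
    return smoothed


def detect_speech(feature, window=50, q=0.1, scale=2.0, hang=3):
    """Frame-wise speech/non-speech decisions: threshold, then hang-over."""
    thresholds = running_quantile_threshold(feature, window, q, scale)
    raw = [f > t for f, t in zip(feature, thresholds)]
    return apply_hangover(raw, hang)


# Toy feature track: noise, a speech burst, a 2-frame pause, another burst.
feature = [1.0] * 10 + [10.0] * 3 + [1.0] * 2 + [10.0] * 3 + [1.0] * 10
decisions = detect_speech(feature)
```

In this toy track the two-frame pause between the bursts is labelled as speech because it is shorter than `hang`. The same mechanism extends the speech region by `hang` frames after the final burst.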
Bibliographic reference. Pham, Tuan Van / Neffe, Michael / Kubin, Gernot (2007): "Robust voice activity detection for narrow-bandwidth speaker verification under adverse environments", In INTERSPEECH-2007, 2037-2040.