11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Voice Activity Detection in a Regularized Reproducing Kernel Hilbert Space

Xugang Lu (1), Masashi Unoki (2), Ryosuke Isotani (1), Hisashi Kawai (1), Satoshi Nakamura (1)

(1) NICT, Japan
(2) JAIST, Japan

Traditional voice activity detection (VAD) algorithms operate in a linearly transformed space without any constraint. As a result, they are not robust to noise interference. Considering the special characteristics of speech, we propose a new speech feature extraction method that constrains the processing space to be a reproducing kernel Hilbert space (RKHS). In the RKHS, we treat speech estimation as a function approximation problem. Under this framework, nonlinear mapping functions can be incorporated into the approximation implicitly via a kernel function, so the approximation function can capture the nonlinear and high-order statistical regularities of speech. Our VAD algorithm is designed on the basis of the signal energy in this regularized RKHS. Experimental results showed promising advantages of the proposed algorithm over a baseline VAD algorithm and the G.729B VAD algorithm.
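
The general idea in the abstract, a regularized function approximation in an RKHS followed by an energy-based decision, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the kernel, the input features, the regularizer, or the decision rule. The sketch assumes a Gaussian kernel over sample indices, a ridge (squared-norm) regularizer, non-overlapping frames, and a fixed energy threshold. All function names and parameter values (`gaussian_kernel`, `rkhs_smooth`, `frame_energy_vad`, `width`, `reg`, `threshold`) are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, width):
    """Gram matrix of a Gaussian kernel between two 1-D point sets (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-(diff ** 2) / (2.0 * width ** 2))


def rkhs_smooth(frame, width=3.0, reg=1.0):
    """Regularized least-squares fit of one frame in the RKHS (kernel ridge regression).

    Solves (K + reg * I) alpha = frame and returns the fitted values K @ alpha.
    The nonlinear feature map is never formed explicitly; only kernel values are used.
    """
    n = len(frame)
    t = np.arange(n, dtype=float)          # sample indices serve as inputs (assumption)
    K = gaussian_kernel(t, t, width)
    alpha = np.linalg.solve(K + reg * np.eye(n), frame)
    return K @ alpha


def frame_energy_vad(signal, frame_len=80, threshold=0.1, width=3.0, reg=1.0):
    """Label each non-overlapping frame as active when its fitted energy exceeds threshold."""
    n_frames = len(signal) // frame_len
    decisions = np.zeros(n_frames, dtype=bool)
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        fitted = rkhs_smooth(frame, width=width, reg=reg)
        decisions[i] = np.mean(fitted ** 2) > threshold   # mean energy of the fitted function
    return decisions
```

Calling `frame_energy_vad(signal)` on a 1-D NumPy array returns one boolean per frame. In this sketch the regularized fit behaves like a low-pass smoother, so broadband noise loses most of its energy while a slowly varying component keeps most of it. The threshold and kernel width would need tuning for real speech data.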


Bibliographic reference. Lu, Xugang / Unoki, Masashi / Isotani, Ryosuke / Kawai, Hisashi / Nakamura, Satoshi (2010): "Voice activity detection in a regularized reproducing kernel Hilbert space", In INTERSPEECH-2010, 3086-3089.