ISCA Archive ISCSLP 2002

A voice activity detection algorithm based on perceptual wavelet packet transform and Teager energy operator

Jhing-Fa Wang, Shi-Huang Chen

This paper presents a new voice activity detection (VAD) algorithm based on the perceptual wavelet packet transform (PWPT) and the Teager energy operator (TEO). The proposed algorithm first uses the PWPT to decompose the input speech into critical subband signals. A parameter called the voice activity shape (VAS) is then derived from the TEO of these critical subband signals. The paper shows that the VAS can be used as a robust feature for VAD. The advantage of the new algorithm is that it requires neither the preset threshold values nor the a priori knowledge of the SNR that conventional VAD methods usually need. Experimental results show that the proposed VAD algorithm can outperform the ITU-T G.729B VAD and can operate reliably in real noisy environments.
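The abstract does not state the TEO formula. As a minimal sketch, assuming the standard discrete-time form of the Teager energy operator, psi[x(n)] = x(n)^2 - x(n-1) x(n+1), the operator can be applied to a single subband signal as shown below. The function name `teager_energy` and the test tone are hypothetical and chosen for illustration only; the PWPT decomposition and the derivation of the VAS are not reproduced here, since the abstract does not describe them.

```python
import math
from typing import List, Sequence


def teager_energy(x: Sequence[float]) -> List[float]:
    """Discrete Teager energy operator (assumed standard form).

    psi[n] = x[n]^2 - x[n-1] * x[n+1]

    Only interior samples have both neighbours, so the result has
    len(x) - 2 values (an empty list when len(x) < 3).
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative input (not from the paper): a pure tone A*cos(W*n).
# For such a tone the operator is constant and equal to A^2 * sin(W)^2,
# so it tracks both the amplitude and the frequency of the signal.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(tone)
```

In a VAD setting one would presumably apply such an operator to each critical subband signal in turn; how the per-subband outputs are combined into the VAS is specific to the paper and is left out of this sketch.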


Cite as: Wang, J.-F., Chen, S.-H. (2002) A voice activity detection algorithm based on perceptual wavelet packet transform and Teager energy operator. Proc. International Symposium on Chinese Spoken Language Processing, paper 125.

@inproceedings{wang02n_iscslp,
  author={Jhing-Fa Wang and Shi-Huang Chen},
  title={{A voice activity detection algorithm based on perceptual wavelet packet transform and Teager energy operator}},
  year=2002,
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},
  pages={paper 125}
}