Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

The Fourth-Order Cumulant of Speech Signals with Application to Voice Activity Detection

Elias Nemer (1), Rafik Goubran (2), Samy Mahmoud (2)

(1) Nortel Networks, Verdun, Quebec, Canada
(2) Systems & Computer Engineering, Carleton University, Ottawa, Ontario, Canada

This paper explores the fourth-order cumulants (FOC) of the LPC residual of speech signals and presents a new algorithm for voice activity detection (VAD) based on the newly established FOC properties. Analytical expressions for the horizontal slice of the fourth-order cumulant, as well as for the kurtosis of voiced speech, are derived from a previously reported sinusoidal model [4]. The derivations demonstrate that the kurtosis of voiced speech is distinct from that of Gaussian noise and can be used to aid in detecting voicing. The proposed VAD combines FOC metrics with SNR measures to classify speech and noise frames. Its performance is compared with the ITU-T G.729B VAD [1] under various noise conditions and quantified using the probabilities of correct and false classification. The results show that the proposed VAD performs comparably overall to the G.729B VAD: its probability of false classification is lower in low SNR and Gaussian-like noise, but higher in speech-like noise.
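
The contrast between the kurtosis of a sinusoidal signal and that of Gaussian noise can be illustrated with a minimal sketch. This is not the paper's derivation or its VAD: it assumes a single sinusoid as a crude stand-in for the sinusoidal model, uses the normalized (excess) kurtosis m4 / m2^2 - 3 as the statistic, and the function name, frequency, and sample count are hypothetical choices made for illustration only.

```python
import math
import random


def excess_kurtosis(x):
    """Sample excess kurtosis m4 / m2**2 - 3, which is zero for a Gaussian."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth central moment
    return m4 / (m2 ** 2) - 3.0


# Hypothetical test signals (assumptions, not taken from the paper).
n = 20000
rng = random.Random(0)  # fixed seed so the run is repeatable

# A single sinusoid spanning a whole number of cycles (0.0123 * 20000 = 246).
sinusoid = [math.sin(2 * math.pi * 0.0123 * k) for k in range(n)]

# Zero-mean, unit-variance white Gaussian noise.
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

k_sin = excess_kurtosis(sinusoid)
k_noise = excess_kurtosis(noise)

print(f"excess kurtosis, sinusoid:       {k_sin:.3f}")
print(f"excess kurtosis, Gaussian noise: {k_noise:.3f}")
```

For a pure sinusoid the statistic settles at -1.5, while for Gaussian noise it fluctuates around zero, so even this simplified case shows a clear separation. The kurtosis of real voiced speech and of its LPC residual depends on the harmonic structure treated in the paper and is not reproduced here.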

Bibliographic reference.  Nemer, Elias / Goubran, Rafik / Mahmoud, Samy (1999): "The fourth-order cumulant of speech signals with application to voice activity detection", In EUROSPEECH'99, 2391-2394.