This paper proposes a new approach to speech end-point detection based on cepstral analysis. The algorithm uses explicit (static) models of speech and non-speech, and a decision is made on each incoming (overlapped) cepstral frame according to its model similarity scores. The cepstral analysis provides excellent level-independence, so that adjustment of parameters, decision thresholds, etc. is unnecessary. A high degree of robustness to additive noise is demonstrated, even though the models are static. Accurate end-points are recovered at an SNR of 0 dB.
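As a rough illustration of the frame-by-frame decision described above, the following Python sketch computes low-order cepstral coefficients for each overlapped frame and labels the frame by its nearest static model. It is not the authors' implementation. The single mean-vector models, the Euclidean distance, the 64-sample frames with 50% overlap, the 12 coefficients, and the synthetic test signals are all assumptions made for illustration, and the function names (`cepstrum`, `train_model`, `detect`, `synth`) are hypothetical. Discarding c0 is one common way to obtain level-independence; the abstract does not say how the paper achieves it.

```python
import cmath
import math
import random


def cepstrum(frame, n_coeffs=12):
    """Return real-cepstrum coefficients c1..c_n_coeffs of one frame.

    c0 is dropped on purpose (an assumption of this sketch): a gain g on
    the input adds log(g) to every log-spectrum bin, which lands only in c0.
    """
    n = len(frame)
    # Hamming window to limit spectral leakage.
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)))
           for t, x in enumerate(frame)]
    # Log-magnitude spectrum via a naive DFT (adequate for short frames).
    log_mag = []
    for k in range(n):
        s = sum(w * cmath.exp(-2j * math.pi * k * t / n)
                for t, w in enumerate(win))
        log_mag.append(math.log(abs(s) + 1e-12))  # small floor avoids log(0)
    # Inverse DFT of the real, symmetric log spectrum gives the cepstrum.
    return [sum(lm * math.cos(2 * math.pi * k * q / n)
                for k, lm in enumerate(log_mag)) / n
            for q in range(1, n_coeffs + 1)]


def split_frames(signal, size=64, hop=32):
    """Cut the signal into overlapped frames (50% overlap by default)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def train_model(signal, size=64, hop=32):
    """Static model (assumed form): mean cepstral vector over all frames."""
    vecs = [cepstrum(f) for f in split_frames(signal, size, hop)]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def distance(a, b):
    """Euclidean distance, standing in for the paper's similarity score."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def detect(signal, speech_model, noise_model, size=64, hop=32):
    """Label each frame True when it is closer to the speech model."""
    labels = []
    for f in split_frames(signal, size, hop):
        c = cepstrum(f)
        labels.append(distance(c, speech_model) < distance(c, noise_model))
    return labels


def synth(kind, n, seed):
    """Toy data only: 'speech' is low-pass AR(1) noise, anything else is white."""
    rng = random.Random(seed)
    out, prev = [], 0.0
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        prev = 0.95 * prev + e if kind == "speech" else e
        out.append(prev)
    return out


speech_model = train_model(synth("speech", 640, seed=1))
noise_model = train_model(synth("noise", 640, seed=2))
test_signal = synth("noise", 256, seed=3) + synth("speech", 256, seed=4)
labels = detect(test_signal, speech_model, noise_model)
# Same signal at 1/100 of the level, with no threshold or parameter changed.
quiet = detect([0.01 * x for x in test_signal], speech_model, noise_model)
```

In this toy setup `labels` and `quiet` are expected to agree, because the gain only shifts c0, which the sketch discards. End-points would then be read off as the frames where the label changes.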
Bibliographic reference. Haigh, J. A. / Mason, J. S. (1993): "A voice activity detector based on cepstral analysis", In EUROSPEECH'93, 1105-1106.