15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Speech Detection in Transient Noises

G. Aneeja, B. Yegnanarayana

IIIT Hyderabad, India

Voice activity detection (VAD) typically uses a representation of speech derived from spectral analysis, followed by statistical characterization of the speech and the degrading noise. Features derived using these traditional methods may not be adequate for VAD in the presence of transient noises. In this paper, we focus on transient noises, for which most VAD systems in the literature do not perform well. A representation with high temporal resolution and high frequency resolution is used to discriminate transient noises from speech.
   The high temporal and frequency resolution representation is obtained by filtering the signal at several single frequencies. This single frequency filtering approach helps to isolate the regions of transient noise in a signal. A time-varying threshold, based on the spectral variance and the temporal variance of the speech signal, is proposed to detect transient noise. The remaining regions are processed using the spectral variance measure for VAD. The results are compared with those of the Adaptive Multi-Rate (AMR) methods. The proposed method performs consistently better, owing to the instantaneous nature of the feature. The percentage of transient noise detected is also higher for the proposed method than for the methods reported in the literature.
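   As an illustration of the filtering step described above, the following is a minimal sketch of one common formulation of single frequency filtering: the differenced signal is frequency-shifted so that the frequency of interest lands at half the sampling rate, then passed through a single-pole filter with its pole near z = -1, and the magnitude of the complex output is taken as the envelope at every sample. The abstract does not give these details, so the function names (sff_envelopes, spectral_variance), the pole radius r = 0.99, the differencing step, and the choice of frequencies are all assumptions made for illustration and may differ from the paper's implementation. The time-varying threshold is not sketched because the abstract does not specify it.

```python
import cmath
import math
import statistics


def sff_envelopes(signal, fs, freqs, r=0.99):
    """Envelope of `signal` at each frequency in `freqs` (Hz), per sample.

    Assumed formulation, not taken from the abstract:
      x[n]   = s[n] - s[n-1]                      (differencing)
      x_k[n] = x[n] * exp(j * (pi - w_k) * n)     (shift f_k to fs/2)
      y_k[n] = -r * y_k[n-1] + x_k[n]             (single pole at z = -r)
      e_k[n] = |y_k[n]|                           (envelope)
    """
    # Difference the signal; the first sample is kept unchanged.
    x = [signal[0]] + [signal[n] - signal[n - 1] for n in range(1, len(signal))]
    envelopes = []
    for f in freqs:
        w_shift = math.pi - 2.0 * math.pi * f / fs
        y_prev = 0j
        env = []
        for n, xn in enumerate(x):
            y = -r * y_prev + xn * cmath.exp(1j * w_shift * n)
            env.append(abs(y))
            y_prev = y
        envelopes.append(env)
    return envelopes


def spectral_variance(envelopes):
    """Variance of the envelopes across frequency at each time instant."""
    return [statistics.pvariance(column) for column in zip(*envelopes)]


if __name__ == "__main__":
    fs = 8000
    # A pure 1 kHz tone: the 1 kHz envelope should dominate the 3 kHz one.
    tone = [math.cos(2 * math.pi * 1000 * n / fs) for n in range(800)]
    env = sff_envelopes(tone, fs, [1000, 3000])
    print(env[0][-1] > env[1][-1])
```

   Because the envelope is available at every sample rather than once per analysis frame, a short burst of noise shows up as a sharp, localized change across the envelopes. This is the sense in which such a feature is instantaneous.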


Bibliographic reference.  Aneeja, G. / Yegnanarayana, B. (2014): "Speech detection in transient noises", In INTERSPEECH-2014, 2356-2360.