8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Robust and High-Resolution Voiced/Unvoiced Classification in Noisy Speech Using a Signal Smoothness Criterion

A. Sreenivasa Murthy, S. Chandra Sekhar, T. V. Sreenivas

Indian Institute of Science, India

We propose a novel technique for robust voiced/unvoiced segment detection in noisy speech, based on local polynomial regression. The local polynomial model is well suited to voiced speech segments, whereas unvoiced segments are noise-like and exhibit no smooth structure. We use this smoothness property to devise a new metric, the variance ratio metric, which, after thresholding, indicates the voiced/unvoiced boundaries with 75% accuracy at a global signal-to-noise ratio (SNR) of 0 dB. A novelty of the algorithm is that it processes the signal continuously, sample by sample, rather than frame by frame. Simulation results on the TIMIT speech database (downsampled to 8 kHz) at various SNRs illustrate the performance of the new algorithm and indicate that it is robust even at high noise levels.
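
The abstract does not define the variance ratio metric, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes the metric is the local variance of the residual (signal minus its local polynomial fit) divided by the local variance of the signal, evaluated at every sample. The function names, the window lengths, the polynomial order and the threshold of 0.4 are all hypothetical choices made for this example.

```python
import math
import random


def center_weights(half_width, order):
    """Weights giving the least-squares polynomial fit at the window center."""
    t = [i / half_width for i in range(-half_width, half_width + 1)]
    size = order + 1
    # Normal equations (A^T A) z = e_0, with A[i][j] = t_i ** j.
    ata = [[sum(ti ** (r + c) for ti in t) for c in range(size)]
           for r in range(size)]
    rhs = [1.0] + [0.0] * order
    # Gaussian elimination with partial pivoting on the small system.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, size):
            f = ata[r][col] / ata[col][col]
            for c in range(col, size):
                ata[r][c] -= f * ata[col][c]
            rhs[r] -= f * rhs[col]
    z = [0.0] * size
    for r in range(size - 1, -1, -1):
        s = sum(ata[r][c] * z[c] for c in range(r + 1, size))
        z[r] = (rhs[r] - s) / ata[r][r]
    # The fitted center value is (A z)^T x, so A z holds the weights.
    return [sum(z[j] * ti ** j for j in range(size)) for ti in t]


def smooth(x, weights):
    """Local polynomial estimate at every sample (edges use clamped indices)."""
    h = len(weights) // 2
    last = len(x) - 1
    return [
        sum(w * x[min(max(n + k - h, 0), last)] for k, w in enumerate(weights))
        for n in range(len(x))
    ]


def moving_mean(x, half_width):
    """Centered moving average, shortened at the signal edges."""
    prefix = [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
    out = []
    for n in range(len(x)):
        lo, hi = max(n - half_width, 0), min(n + half_width + 1, len(x))
        out.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return out


def local_variance(x, half_width):
    """Per-sample variance estimate over a centered window."""
    m1 = moving_mean(x, half_width)
    m2 = moving_mean([v * v for v in x], half_width)
    return [max(b - a * a, 0.0) for a, b in zip(m1, m2)]


def variance_ratio(x, fit_half=8, order=3, var_half=40):
    """Residual variance over signal variance, per sample (assumed definition)."""
    est = smooth(x, center_weights(fit_half, order))
    resid = [a - b for a, b in zip(x, est)]
    num = local_variance(resid, var_half)
    den = local_variance(x, var_half)
    return [a / (b + 1e-12) for a, b in zip(num, den)]


def classify(x, threshold=0.4):
    """True where a sample is labelled voiced (smooth), False otherwise."""
    return [r < threshold for r in variance_ratio(x)]


if __name__ == "__main__":
    rng = random.Random(0)
    fs = 8000  # sampling rate in Hz, matching the downsampled TIMIT data
    # Synthetic stand-ins: a noisy 150 Hz sinusoid, then white noise.
    voiced = [math.sin(2 * math.pi * 150 * n / fs) + rng.gauss(0, 0.05)
              for n in range(2000)]
    unvoiced = [rng.gauss(0, 0.3) for _ in range(2000)]
    labels = classify(voiced + unvoiced)
    print(sum(labels[:2000]) / 2000, sum(labels[2000:]) / 2000)
```

In this sketch a smooth segment is well explained by the local polynomial, so its residual variance is small relative to the signal variance and the ratio falls below the threshold. A noise-like segment leaves most of its variance in the residual. Because both variances are updated at every sample, the decision is produced sample by sample rather than once per frame. The synthetic sinusoid and white noise are stand-ins for voiced and unvoiced speech, not real data.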

Bibliographic reference. Murthy, A. Sreenivasa / Sekhar, S. Chandra / Sreenivas, T. V. (2007): "Robust and high-resolution voiced/unvoiced classification in noisy speech using a signal smoothness criterion", in INTERSPEECH-2007, 2965-2968.