A novel Statistical Approach for F0 Estimation, SAFE, is proposed to improve the accuracy of F0 tracking under both clean and additive-noise conditions. Prominent Signal-to-Noise Ratio (SNR) peaks in speech spectra are a robust source of information from which F0 can be inferred. A probabilistic framework is proposed to model the effect of additive noise on voiced speech spectra. Prominent SNR peaks in the low-frequency band are found to be important for F0 estimation, while prominent SNR peaks in the middle- and high-frequency bands provide useful supplementary information under noisy conditions, especially in babble noise. Experiments show that SAFE achieves lower Gross Pitch Error (GPE) than the prevailing F0 trackers Get_F0, Praat, TEMPO, and YIN in white and babble noise at low SNRs.
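The abstract does not define GPE. A common convention in F0-tracking evaluations counts a voiced frame as a gross error when the estimated F0 deviates from the reference by more than 20%, and reports the fraction of such frames. The following is a minimal sketch under that assumption; the function name, the 20% tolerance, the treatment of unvoiced frames (marked as 0 Hz), and the toy values are illustrative and not taken from the paper.

```python
def gross_pitch_error(ref_f0, est_f0, tol=0.20):
    """Fraction of reference-voiced frames whose estimate is off by more than tol.

    Assumptions (not from the paper): unvoiced frames are marked with 0 Hz,
    only frames voiced in the reference are scored, and an estimate of 0 Hz
    on a voiced frame counts as a gross error.
    """
    # Keep only frames that the reference labels as voiced.
    voiced = [(r, e) for r, e in zip(ref_f0, est_f0) if r > 0]
    if not voiced:
        return 0.0
    # A frame is a gross error when the relative deviation exceeds tol.
    gross = sum(1 for r, e in voiced if abs(e - r) > tol * r)
    return gross / len(voiced)


# Toy per-frame F0 tracks in Hz (hypothetical values).
ref = [0.0, 100.0, 110.0, 120.0, 0.0, 200.0]
est = [0.0, 102.0, 220.0, 118.0, 90.0, 150.0]
gpe = gross_pitch_error(ref, est)
print(f"GPE = {gpe:.2f}")
```

In the toy tracks, two of the four reference-voiced frames (an octave error at 110 Hz and a 25% deviation at 200 Hz) exceed the tolerance. Evaluations differ in whether they score only frames that both the reference and the tracker label as voiced, so the voicing rule here should be adapted to the protocol being reproduced.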
Bibliographic reference. Chu, Wei / Alwan, Abeer (2010): "SAFE: a statistical algorithm for F0 estimation for both clean and noisy speech", In INTERSPEECH-2010, 2590-2593.