14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Improving the Accuracy and the Robustness of Harmonic Model for Pitch Estimation

Meysam Asgari, Izhak Shafran

Oregon Health & Science University, USA

Accurate and robust estimation of pitch plays a central role in speech processing. Various methods in the time, frequency and cepstral domains have been proposed for generating pitch candidates. Most algorithms excel only when the background noise is minimal or for specific types of background noise. In this work, our aim is to improve the robustness and accuracy of pitch estimation across a wide variety of background noise conditions. For this we adopt the harmonic model of speech, a model that has gained considerable attention recently. We address two major weaknesses of this model: the problem of pitch halving and doubling, and the need to specify the number of harmonics. We exploit the energy at neighboring frequencies to alleviate halving and doubling. Using a model complexity term with the Bayesian information criterion (BIC), we choose the optimal number of harmonics. We compare the proposed pitch estimation method against other state-of-the-art techniques on the Keele data set in terms of gross pitch error and fine pitch error. Through extensive experiments on several noisy conditions, we demonstrate that the proposed improvements provide substantial gains over other popular methods under different noise levels and environments.
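
To make the idea of choosing the number of harmonics with BIC concrete, the following is a minimal sketch, not the authors' implementation. It assumes a least-squares fit of a DC term plus H cosine/sine pairs at multiples of a candidate pitch, and the textbook Gaussian form BIC = N ln(RSS/N) + k ln(N) with k = 2H + 1 parameters; the paper's exact likelihood and complexity term may differ. The neighborhood-energy correction for halving and doubling is not reproduced here. All function names, the candidate grid, and the synthetic test signal are hypothetical.

```python
import numpy as np


def harmonic_basis(f0, n_harmonics, n_samples, fs):
    """Design matrix: a DC column plus cos/sin columns at h * f0, h = 1..H."""
    t = np.arange(n_samples) / fs
    h = np.arange(1, n_harmonics + 1)
    phase = 2.0 * np.pi * f0 * np.outer(t, h)
    return np.hstack([np.ones((n_samples, 1)), np.cos(phase), np.sin(phase)])


def bic_score(frame, f0, n_harmonics, fs):
    """Gaussian-residual BIC of a least-squares harmonic fit (assumed form)."""
    n = len(frame)
    basis = harmonic_basis(f0, n_harmonics, n, fs)
    coef, *_ = np.linalg.lstsq(basis, frame, rcond=None)
    resid = frame - basis @ coef
    rss = max(float(resid @ resid), 1e-12)  # floor avoids log(0) on clean input
    n_params = 2 * n_harmonics + 1
    return n * np.log(rss / n) + n_params * np.log(n)


def estimate_pitch(frame, fs, f0_grid, max_harmonics=10):
    """Search jointly over pitch candidates and harmonic counts; keep lowest BIC."""
    best_score, best_f0, best_h = np.inf, None, None
    for f0 in f0_grid:
        # Keep every harmonic strictly below the Nyquist frequency.
        h_limit = min(max_harmonics, int(np.ceil(fs / 2.0 / f0)) - 1)
        for n_harmonics in range(1, h_limit + 1):
            score = bic_score(frame, f0, n_harmonics, fs)
            if score < best_score:
                best_score, best_f0, best_h = score, float(f0), n_harmonics
    return best_f0, best_h


def synthetic_frame(f0, amplitudes, fs, n_samples, noise_std=0.05, seed=0):
    """Hypothetical test signal: a few harmonics of f0 plus white Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / fs
    clean = sum(a * np.cos(2.0 * np.pi * (i + 1) * f0 * t)
                for i, a in enumerate(amplitudes))
    return clean + noise_std * rng.standard_normal(n_samples)


if __name__ == "__main__":
    fs = 8000
    frame = synthetic_frame(200.0, [1.0, 0.5, 0.25], fs, n_samples=400)
    print(estimate_pitch(frame, fs, np.arange(80, 401, 5)))
```

In this sketch the complexity penalty also discourages the half-pitch candidate: fitting a 200 Hz voice with a 100 Hz model requires roughly twice as many harmonics for the same residual, so its BIC is higher. This is only one simple mechanism and should not be read as the paper's method for handling halving and doubling.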


Bibliographic reference. Asgari, Meysam / Shafran, Izhak (2013): "Improving the accuracy and the robustness of harmonic model for pitch estimation", in INTERSPEECH-2013, 1936-1940.