In this paper we propose a model-based approach to instantaneous pitch estimation in noisy speech, by way of incorporating pitch smoothness assumptions into the well-known harmonic model. In this approach, the latent pitch contour is modeled using a basis of smooth polynomials, and is fit to waveform data by way of a harmonic model whose partials have time-varying amplitudes. The resultant nonlinear least squares estimation task is accomplished through the Gauss-Newton method with a novel initialization step that serves to greatly increase algorithm efficiency. We demonstrate the accuracy and robustness of our method through comparisons to state-of-the art pitch estimation algorithms using both simulated and real waveform data.
Bibliographic reference. Hong, Jung Ook / Wolfe, Patrick J. (2009): "Model-based estimation of instantaneous pitch in noisy speech", In INTERSPEECH-2009, 112-115.