13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Robust Pitch Estimation Using l1-regularized Maximum Likelihood Estimation

Feng Huang, Tan Lee

Department of Electronic Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong, China

This paper presents a new method of robust pitch estimation using sparsity-based estimation techniques. The method is based on a sparse representation of a temporal-spectral pitch feature. This robust pitch feature is obtained by accumulating spectral peaks over consecutive frames, and it is expressed as a sparse linear combination of an over-complete set of peak spectrum exemplars. The noise is assumed to follow a Gaussian distribution with non-zero mean. The weights of the linear combination are estimated by maximizing the likelihood of the feature under a sparsity constraint, which is incorporated as an l1 regularization term. From the estimated weights, the major constituent exemplars are identified and the fundamental frequency is determined. Experimental results show that this method significantly improves pitch estimation accuracy, particularly at low signal-to-noise ratios.
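
The estimation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the noise is taken to be isotropic Gaussian with one scalar mean b, so the l1-regularized negative log-likelihood reduces to 0.5 * ||y - A w - b||^2 + lam * ||w||_1; the exemplar dictionary A is a hypothetical one built from Gaussian bumps at the harmonics of each candidate F0; and the solver is a plain proximal-gradient (ISTA) loop. The names harmonic_exemplars, l1_ml_weights, and estimate_f0, and the parameters width, lam, and n_iter, are all hypothetical.

```python
import numpy as np

def harmonic_exemplars(f0_candidates, freqs, width=10.0):
    """Hypothetical exemplar dictionary: one unit-norm column per candidate F0,
    made of Gaussian bumps (std `width` Hz) at that candidate's harmonics."""
    cols = []
    for f0 in f0_candidates:
        harmonics = np.arange(f0, freqs[-1], f0)
        bumps = np.exp(-0.5 * ((freqs[:, None] - harmonics[None, :]) / width) ** 2)
        col = bumps.sum(axis=1)
        cols.append(col / np.linalg.norm(col))
    return np.stack(cols, axis=1)

def l1_ml_weights(y, A, lam=0.05, n_iter=500):
    """Minimize 0.5 * ||y - A w - b||^2 + lam * ||w||_1 over weights w and
    a scalar noise mean b (assumed noise model: isotropic Gaussian, mean b)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient in w
    w = np.zeros(A.shape[1])
    b = 0.0
    for _ in range(n_iter):
        b = np.mean(y - A @ w)               # exact minimizer over b for the current w
        grad = A.T @ (A @ w + b - y)         # gradient of the quadratic term w.r.t. w
        z = w - step * grad
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-thresholding
    return w, b

def estimate_f0(y, A, f0_candidates, **kwargs):
    """Return the candidate F0 whose exemplar receives the largest weight."""
    w, _ = l1_ml_weights(y, A, **kwargs)
    return f0_candidates[int(np.argmax(np.abs(w)))]
```

Picking the single largest weight is the simplest possible decision rule; the abstract only states that the major constituent exemplars are identified from the weights, so the actual selection rule may differ.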

Index Terms: Robust pitch estimation, speech sparsity, l1 regularization, peak spectrum


Bibliographic reference. Huang, Feng / Lee, Tan (2012): "Robust pitch estimation using l1-regularized maximum likelihood estimation", In INTERSPEECH-2012, 378-381.