15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Dictionary-Based Pitch Tracking with Dynamic Programming

Ewout van den Berg, Bhuvana Ramabhadran

IBM T.J. Watson Research Center, USA

Pitch detection has important applications in automatic speech recognition, including prosody detection, tonal language transcription, and general feature augmentation. In this paper we describe Pitcher, a new pitch tracking algorithm that correlates spectral information with a dictionary of waveforms, each of which is designed to match signals with a given pitch value. We apply dynamic programming to the resulting coefficient matrix to extract a smooth pitch contour while still allowing for pitch halving and doubling transitions. We discuss the design of pitch atoms along with the various considerations for the pitch extraction process. We evaluate Pitcher on the PTDB database and compare it with three existing pitch tracking algorithms: YIN, IRAPT, and Swipe'. Pitcher consistently outperforms the other methods for low-pitched speakers and is comparable to the best of the other three methods for high-pitched speakers.
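The dynamic programming step can be illustrated with a minimal Viterbi-style sketch. It is not the authors' implementation: the abstract does not give the cost function, so the function name `track_pitch`, the parameters `jump_cost` and `octave_cost`, and the form of the transition penalty are all assumptions made for illustration. The sketch takes a coefficient matrix (one row per frame, one column per candidate pitch) and picks the path that maximizes total score minus transition penalties, where a jump of exactly one octave is charged a small flat cost instead of the full distance-based cost.

```python
import numpy as np


def track_pitch(scores, pitches, jump_cost=4.0, octave_cost=1.0):
    """Pick one candidate pitch per frame by dynamic programming.

    scores  : (n_frames, n_candidates) match strength per frame and candidate
    pitches : (n_candidates,) candidate pitch values in Hz
    jump_cost, octave_cost : hypothetical penalty weights (assumptions)
    """
    scores = np.asarray(scores, dtype=float)
    pitches = np.asarray(pitches, dtype=float)
    log_p = np.log2(pitches)

    # delta[i, j] = distance in octaves between candidates i and j
    delta = np.abs(log_p[:, None] - log_p[None, :])
    # Ordinary moves pay in proportion to the jump size; moves close to
    # exactly one octave (halving or doubling) pay a small flat cost.
    smooth = jump_cost * delta
    octave = octave_cost + jump_cost * np.abs(delta - 1.0)
    penalty = np.minimum(smooth, octave)

    n_frames, n_cand = scores.shape
    best = np.empty((n_frames, n_cand))
    back = np.zeros((n_frames, n_cand), dtype=int)
    best[0] = scores[0]
    for t in range(1, n_frames):
        # total[i, j] = best path score ending at i, then moving to j
        total = best[t - 1][:, None] - penalty
        back[t] = np.argmax(total, axis=0)
        best[t] = total[back[t], np.arange(n_cand)] + scores[t]

    # Trace the best path backwards from the final frame
    path = np.empty(n_frames, dtype=int)
    path[-1] = np.argmax(best[-1])
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return pitches[path]


# Toy input: frame 2 has a spurious peak at 150 Hz
candidates = [100.0, 150.0, 200.0]
noisy = [[1.0, 0.1, 0.2],
         [1.0, 0.1, 0.2],
         [0.8, 0.9, 0.2],
         [1.0, 0.1, 0.2],
         [1.0, 0.1, 0.2]]
contour = track_pitch(noisy, candidates)
```

In this toy input a frame-by-frame maximum would pick 150 Hz in the third frame, whereas the penalized path stays at 100 Hz throughout. Lowering `octave_cost` relative to `jump_cost` makes doubling and halving transitions cheaper than other large jumps, which is one simple way to model the behavior the abstract describes.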


Bibliographic reference. Berg, Ewout van den / Ramabhadran, Bhuvana (2014): "Dictionary-based pitch tracking with dynamic programming", in INTERSPEECH-2014, 1347-1351.