A multipitch tracker for monaural speech segmentation

André Coy, Jon Barker

This paper presents a novel algorithm for forming coherent harmonic fragments from a mixture of speech sources. A multiple pitch detection algorithm produces pitch candidates, which are then tracked using a pair of parallel HMMs. One novel aspect of the technique is that it systematically models pitch doubling and halving errors, thereby facilitating the identification of smooth pitch segments even in the absence of the fundamental frequency. The system does not face the problem of incorrect source assignment that can occur when sources have similar fundamental frequencies or are harmonically related. An evaluation of the technique shows that the algorithm’s emphasis on tracking coherent segments leads to the formation of speech fragments with high coherence, indicating a more reliable segmentation of the harmonic speech regions.
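
As a rough illustration of how modelling doubling and halving errors can help recover a smooth pitch track, the sketch below runs a single Viterbi pass over per-frame pitch candidates. It is not the paper's pair of parallel HMMs: the function names `expand` and `track_pitch`, the octave factors, the log-frequency transition cost, and the `octave_penalty` value are all assumptions chosen for illustration only.

```python
import math

# Assumed octave hypotheses for each observed candidate: it is the true F0 (1.0),
# a doubled estimate (2.0), or a halved estimate (0.5).
OCTAVE_FACTORS = (1.0, 2.0, 0.5)


def expand(candidates, octave_penalty):
    """Turn observed candidates (Hz) into (implied F0, state cost) pairs."""
    return [(c / k, 0.0 if k == 1.0 else octave_penalty)
            for c in candidates for k in OCTAVE_FACTORS]


def track_pitch(frames, octave_penalty=0.3):
    """Return one smooth F0 path (Hz) through per-frame pitch candidates.

    frames: list of non-empty lists of candidate pitches in Hz.
    The transition cost is the absolute pitch jump measured in octaves
    (an illustrative choice, not taken from the paper).
    """
    all_states = [expand(c, octave_penalty) for c in frames]
    cost = [pen for _, pen in all_states[0]]
    backptrs = []
    for t in range(1, len(all_states)):
        prev, cur = all_states[t - 1], all_states[t]
        new_cost, ptr = [], []
        for f0, pen in cur:
            # Accumulated cost of reaching this state from each previous state.
            trans = [cost[i] + abs(math.log2(f0 / prev[i][0]))
                     for i in range(len(prev))]
            best = min(range(len(prev)), key=trans.__getitem__)
            new_cost.append(trans[best] + pen)
            ptr.append(best)
        cost = new_cost
        backptrs.append(ptr)
    # Backtrace from the cheapest final state.
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [all_states[-1][j][0]]
    for t in range(len(backptrs) - 1, -1, -1):
        j = backptrs[t][j]
        path.append(all_states[t][j][0])
    path.reverse()
    return path
```

With the hypothetical input `[[100.0], [101.0], [202.0], [102.0], [51.0]]`, the doubled third frame and the halved fifth frame are reinterpreted as 101 Hz and 102 Hz, so the returned track stays near 100 Hz throughout instead of jumping by an octave.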

doi: 10.21437/Interspeech.2006-467

Cite as: Coy, A., Barker, J. (2006) A multipitch tracker for monaural speech segmentation. Proc. Interspeech 2006, paper 1000-Wed1FoP.7, doi: 10.21437/Interspeech.2006-467

