EUROSPEECH 2003 - INTERSPEECH 2003
A methodology is presented for fundamental frequency estimation of one or more voices. The signal is modeled as the sum of one or more periodic signals, and the parameters are estimated by search with interpolation. Accurate, reliable estimates are obtained for each frame without tracking or continuity constraints, and without the use of specific instrument models (although their use might further boost performance). In a formal evaluation over a large database of speech, the single-voice algorithm outperformed the best competing methods by a factor of three.
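As an illustration only, the single-voice case of "search with interpolation" can be sketched as follows. This is not the authors' implementation: it assumes a plain squared-difference function evaluated on a grid of integer lags, followed by parabolic interpolation around the best lag. The function names (`difference_function`, `estimate_f0`) and the default search range are hypothetical choices made for this sketch.

```python
import numpy as np


def difference_function(x, tau_min, tau_max):
    """Return integer lags and the squared difference d(tau) for each lag.

    A fixed window length is used so that every lag is scored on the
    same number of samples.
    """
    n = len(x) - tau_max
    if n <= 0:
        raise ValueError("frame is too short for the requested lag range")
    lags = np.arange(tau_min, tau_max + 1)
    d = np.array([np.sum((x[:n] - x[tau:tau + n]) ** 2) for tau in lags])
    return lags, d


def estimate_f0(x, fs, f_min=133.0, f_max=400.0):
    """Estimate F0 (Hz) of one frame by lag search plus parabolic interpolation.

    Assumption for this sketch: the search range spans less than two
    octaves, so only one multiple of the true period falls inside it.
    """
    tau_min = int(fs / f_max)
    tau_max = int(fs / f_min)
    lags, d = difference_function(np.asarray(x, dtype=float), tau_min, tau_max)

    # Grid search: the lag that best cancels the signal against itself.
    i = int(np.argmin(d))
    tau = float(lags[i])

    # Refine the integer lag with a parabola through the three nearest points.
    if 0 < i < len(d) - 1:
        a, b, c = d[i - 1], d[i], d[i + 1]
        denom = a - 2.0 * b + c
        if denom > 0:
            tau += 0.5 * (a - c) / denom

    return fs / tau


if __name__ == "__main__":
    fs = 16000.0
    t = np.arange(1024) / fs
    frame = np.sin(2 * np.pi * 210.0 * t)  # synthetic test tone, not speech
    print(estimate_f0(frame, fs))
```

Each frame is processed independently here, with no tracking or continuity constraint. The multi-voice case described in the abstract would need a joint search over several periods, which this sketch does not attempt.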
Bibliographic reference. Cheveigne, Alain de / Baskind, Alexis (2003): "F_0 estimation of one or several voices", In EUROSPEECH-2003, 833-836.