10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Robust F0 Estimation Based on Log-Time Scale Autocorrelation and its Application to Mandarin Tone Recognition

Yusuke Kida, Masaru Sakai, Takashi Masuko, Akinori Kawamura

Toshiba Corporate R&D Center, Japan

This paper proposes a novel F0 estimation method in which delta-logF0 is estimated directly from the autocorrelation function (ACF) on a logarithmic time scale. Because the peaks of the ACF of a periodic signal form a specific pattern on the log-time scale, and the period affects only the position of that pattern, delta-logF0 can be estimated directly from the shift of the peaks of the log-time scale ACF (LTACF) without first estimating F0. LogF0 is then estimated from the sum of LTACFs shifted according to delta-logF0. Experimental results show that the proposed method is more robust against noise than the baseline ACF-based method. The proposed method is also shown to significantly improve Mandarin tone recognition accuracy.
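The shift property described in the abstract can be illustrated with a small sketch. The ACF of a signal with period T peaks at lags kT (k = 1, 2, ...), so on a log-lag axis the peaks sit at log T + log k: the spacing pattern is fixed and only its offset depends on T. The Python code below is not the authors' implementation. It is a minimal illustration under assumptions: clean synthetic harmonic frames, a linearly interpolated ACF, and a brute-force normalized correlation search over shifts. All function names and parameter values (`harmonic_signal`, `acf`, `log_time_acf`, `estimate_delta_logf0`, the lag range, the bin count, the maximum shift) are hypothetical choices made for this example.

```python
import numpy as np

def harmonic_signal(f0, fs, duration, n_harmonics=5):
    """Synthetic periodic test frame (assumption: clean, stationary F0)."""
    t = np.arange(int(fs * duration)) / fs
    return sum(np.sin(2 * np.pi * f0 * h * t) / h
               for h in range(1, n_harmonics + 1))

def acf(x, n_lags):
    """Unbiased autocorrelation for lags 0..n_lags-1, scaled so r[0] = 1."""
    x = x - x.mean()
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) / (n - k) for k in range(n_lags)])
    return r / r[0]

def log_time_acf(r, fs, log_lags):
    """Resample an ACF onto a uniform grid of log-lag values (log seconds)."""
    lags = np.arange(1, len(r)) / fs   # skip lag 0, whose log is undefined
    return np.interp(np.exp(log_lags), lags, r[1:])

def estimate_delta_logf0(x_prev, x_cur, fs, n_bins=1000,
                         min_lag=0.002, max_lag=0.03, max_shift=0.3):
    """Estimate log(F0_cur / F0_prev) from the shift between two LTACFs."""
    log_lags = np.linspace(np.log(min_lag), np.log(max_lag), n_bins)
    step = log_lags[1] - log_lags[0]
    n_lags = int(max_lag * fs) + 2
    a = log_time_acf(acf(x_prev, n_lags), fs, log_lags)
    b = log_time_acf(acf(x_cur, n_lags), fs, log_lags)

    max_s = int(max_shift / step)
    best_s, best_score = 0, -np.inf
    for s in range(-max_s, max_s + 1):
        # Compare a[i] with b[i + s] over the overlapping bins only.
        if s >= 0:
            u, v = a[:n_bins - s], b[s:]
        else:
            u, v = a[-s:], b[:n_bins + s]
        score = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
        if score > best_score:
            best_s, best_score = s, score

    # The best shift equals log(T_cur / T_prev), the negative of delta-logF0.
    return -best_s * step

if __name__ == "__main__":
    fs = 16000
    x_prev = harmonic_signal(200.0, fs, 0.1)
    x_cur = harmonic_signal(220.0, fs, 0.1)
    print(estimate_delta_logf0(x_prev, x_cur, fs), np.log(220.0 / 200.0))
```

For the two test frames, the printed estimate should be close to log(220/200), within the resolution of the log-lag grid. No absolute F0 value is computed at any point: only the relative displacement of the peak pattern is used, which is the property the abstract relies on. The noise robustness and the subsequent logF0 estimation from summed, shifted LTACFs are not covered by this sketch.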


Bibliographic reference. Kida, Yusuke / Sakai, Masaru / Masuko, Takashi / Kawamura, Akinori (2009): "Robust F0 estimation based on log-time scale autocorrelation and its application to Mandarin tone recognition", in INTERSPEECH-2009, 2971-2974.