In this paper we evaluate the performance of 8 different speech rate estimators [1, 2, 3, 4, 5] previously described in the literature by applying them on a multilingual test database . All the estimators show an underestimation at high speech rates and some also suffer from an overestimation at low speech rates. Overall the tested methods obtain high correlation coefficients with the reference speech rate. The Temporal Correlation and Selected Sub-band Correlation method (tcssbc), which uses sub-band and time domain correlation for detecting the number of vowels or diphthongs present in the speech signal, shows little errors and appears to be the most appropriate overall technique for speech rate estimation.
Bibliographic reference. Dekens, Tomas / Demol, Mike / Verhelst, Werner / Verhoeve, Piet (2007): "A comparative study of speech rate estimation techniques", In INTERSPEECH-2007, 510-513.