INTERSPEECH 2007
8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

A Comparative Study of Speech Rate Estimation Techniques

Tomas Dekens (1), Mike Demol (1), Werner Verhelst (1), Piet Verhoeve (2)

(1) Vrije Universiteit Brussel, Belgium
(2) Televic, Belgium

In this paper we evaluate the performance of eight speech rate estimators [1, 2, 3, 4, 5] previously described in the literature by applying them to a multilingual test database [6]. All the estimators underestimate the speech rate at high speech rates, and some also overestimate it at low speech rates. Overall, the tested methods obtain high correlation coefficients with the reference speech rate. The Temporal Correlation and Selected Sub-band Correlation method (tcssbc), which uses sub-band and time-domain correlation to detect the number of vowels or diphthongs present in the speech signal, shows small errors and appears to be the most appropriate overall technique for speech rate estimation.
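The abstract reports two kinds of observation: correlation with the reference speech rate, and systematic under- or overestimation at the extremes. As a minimal sketch of how such figures could be computed (not the authors' evaluation code), the following assumes paired lists of reference and estimated rates in syllables per second. The function names, the threshold of 5.0, and the sample values are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_error(estimated, reference):
    """Mean signed error; negative values indicate underestimation."""
    return sum(e - r for e, r in zip(estimated, reference)) / len(reference)


def bias_by_range(estimated, reference, threshold):
    """Mean signed error for slow (< threshold) and fast (>= threshold) items."""
    slow = [(e, r) for e, r in zip(estimated, reference) if r < threshold]
    fast = [(e, r) for e, r in zip(estimated, reference) if r >= threshold]
    slow_bias = sum(e - r for e, r in slow) / len(slow) if slow else None
    fast_bias = sum(e - r for e, r in fast) / len(fast) if fast else None
    return slow_bias, fast_bias


# Hypothetical data (syllables per second), not taken from the paper.
reference = [2.0, 4.0, 6.0, 8.0]
estimated = [2.5, 4.0, 5.5, 7.0]

r = pearson(estimated, reference)
overall = mean_error(estimated, reference)
slow_bias, fast_bias = bias_by_range(estimated, reference, threshold=5.0)
print(r, overall, slow_bias, fast_bias)
```

In this made-up example the estimates are a linear function of the reference, so the correlation is essentially perfect even though slow items are overestimated and fast items underestimated. This is why a high correlation coefficient and a range-dependent bias can be reported for the same estimator.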


Bibliographic reference.  Dekens, Tomas / Demol, Mike / Verhelst, Werner / Verhoeve, Piet (2007): "A comparative study of speech rate estimation techniques", In INTERSPEECH-2007, 510-513.