INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Evaluation of Glottal Epoch Detection Algorithms on Different Voice Types

João P. Cabral (1), John Kane (2), Christer Gobl (2), Julie Carson-Berndsen (1)

(1) University College Dublin, Ireland
(2) Trinity College Dublin, Ireland

According to the source-filter model of speech production, speech can be modelled as an excitation signal passed through the vocal tract filter. The epoch, or instant of maximum excitation, corresponds to the glottal closure instant. Several speech processing applications require robust epoch detection, but this can be a difficult task. Although state-of-the-art epoch estimation methods can produce reliable results, they are generally evaluated using speech recorded with a neutral voice quality (modal voice). This paper reviews and evaluates six popular algorithms for detecting glottal closure instants on speech spoken with modal voice and seven additional voice qualities. Results show that the performance of each method is affected by the voice type and that some methods perform better than others for each voice quality.
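
As a minimal illustration of the source-filter view described in the abstract, the sketch below passes an idealised excitation (one unit impulse per glottal closure instant) through an all-pole filter standing in for the vocal tract. It is not one of the six algorithms evaluated in the paper, and every name and value in it (`impulse_train`, `all_pole_filter`, the 8 kHz sampling rate, the 100 Hz pitch, the single 500 Hz resonance) is an assumption chosen only for illustration.

```python
import math


def impulse_train(n_samples, period):
    """Idealised excitation: a unit impulse at each (assumed) glottal closure instant."""
    epochs = list(range(0, n_samples, period))
    excitation = [0.0] * n_samples
    for n in epochs:
        excitation[n] = 1.0
    return excitation, epochs


def all_pole_filter(excitation, a):
    """All-pole filter: s[n] = e[n] - sum_k a[k-1] * s[n-k], for k = 1..len(a)."""
    speech = []
    for n, e in enumerate(excitation):
        acc = e
        for k, coeff in enumerate(a, start=1):
            if n - k >= 0:
                acc -= coeff * speech[n - k]
        speech.append(acc)
    return speech


# Hypothetical settings: 8 kHz sampling rate, 100 Hz pitch (80-sample period).
fs = 8000
period = 80
excitation, epochs = impulse_train(400, period)

# Hypothetical vocal tract: one resonance at 500 Hz, pole radius 0.95 (stable).
r, f = 0.95, 500.0
a = [-2.0 * r * math.cos(2.0 * math.pi * f / fs), r * r]

speech = all_pole_filter(excitation, a)
print(epochs)
```

In this toy signal the epochs are known by construction. An epoch detection algorithm has to recover them from `speech` alone, which is harder for real recordings, where the excitation is not an ideal impulse train.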

Bibliographic reference.  Cabral, João P. / Kane, John / Gobl, Christer / Carson-Berndsen, Julie (2011): "Evaluation of glottal epoch detection algorithms on different voice types", In INTERSPEECH-2011, 1989-1992.