11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Automatic Perceptual Categorization of Disordered Connected Speech

Ali Alpan (1), Jean Schoentgen (1), Youri Maryn (2), Francis Grenez (1)

(1) Université Libre de Bruxelles, Belgium
(2) Sint-Jan General Hospital, Belgium

The objective of the presentation is to report experiments involving the automatic classification of disordered connected speech into binary (normal, pathological) or multiple (modal, moderately hoarse, severely hoarse) categories. The multi-category classification according to the perceived degree of hoarseness is considered to be clinically meaningful and desirable given that the reliable perceptual classification by humans of disordered voice stimuli is known to be difficult and time-consuming. The acoustic cues are temporal signal-to-dysperiodicity ratios as well as mel-frequency cepstral coefficients. The classifiers are support vector machines which have been trained and tested on two connected speech corpora. The binary classification accuracy has been high (98%) for both sets of acoustic cues. The multi-category classification accuracy has been 70% when based on signal-to-dysperiodicity ratios and 59% when based on mel-frequency cepstral coefficients.
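
As an illustration of the first set of acoustic cues, the sketch below computes a signal-to-dysperiodicity ratio in decibels. The abstract does not define the ratio, so this assumes the conventional form, ten times the base-10 logarithm of signal energy over dysperiodicity energy. The function name, the toy signal, and the way the dysperiodicity trace is obtained are all hypothetical and are not taken from the paper.

```python
import math


def signal_to_dysperiodicity_ratio_db(signal, dysperiodicity):
    """Return 10*log10(signal energy / dysperiodicity energy) in dB.

    Assumption: the dysperiodicity trace has already been estimated by
    some other procedure and is sample-aligned with the signal.
    """
    if len(signal) != len(dysperiodicity):
        raise ValueError("signal and dysperiodicity must have the same length")
    signal_energy = sum(s * s for s in signal)
    dysperiodicity_energy = sum(e * e for e in dysperiodicity)
    if signal_energy == 0 or dysperiodicity_energy == 0:
        raise ValueError("both energies must be non-zero")
    return 10.0 * math.log10(signal_energy / dysperiodicity_energy)


# Toy data (hypothetical): a sine wave with a period of 50 samples.
period = 50
signal = [math.sin(2 * math.pi * n / period) for n in range(500)]

# Hypothetical dysperiodicity trace, arbitrarily set to one tenth of the
# signal amplitude so that the energy ratio is easy to check by hand.
dysperiodicity = [0.1 * s for s in signal]

sdr = signal_to_dysperiodicity_ratio_db(signal, dysperiodicity)
print(f"SDR = {sdr:.2f} dB")
```

With an amplitude ratio of 10, the energy ratio is 100 and the result is about 20 dB. A higher ratio corresponds to a more regular signal, so lower values would be expected for hoarser voices.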

Bibliographic reference. Alpan, Ali / Schoentgen, Jean / Maryn, Youri / Grenez, Francis (2010): "Automatic perceptual categorization of disordered connected speech", in INTERSPEECH-2010, 2574-2577.