This presentation reports experiments on the automatic classification of disordered connected speech into binary (normal, pathological) or multiple (modal, moderately hoarse, severely hoarse) categories. Multi-category classification according to the perceived degree of hoarseness is clinically meaningful and desirable, because reliable perceptual classification of disordered voice stimuli by human listeners is known to be difficult and time-consuming. The acoustic cues are temporal signal-to-dysperiodicity ratios and mel-frequency cepstral coefficients. The classifiers are support vector machines, trained and tested on two connected speech corpora. Binary classification accuracy was high (98%) for both sets of acoustic cues. Multi-category classification accuracy was 70% when based on signal-to-dysperiodicity ratios and 59% when based on mel-frequency cepstral coefficients.
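To make the first acoustic cue more concrete, the following is a minimal sketch of how a temporal signal-to-dysperiodicity ratio might be computed. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say how the dysperiodicity is estimated, so the sketch assumes it can be approximated, frame by frame, as the smallest-energy difference between a frame and a gain-matched copy of the signal shifted back by a candidate cycle length. The function name `signal_to_dysperiodicity_ratio` and the parameters `frame_len`, `min_lag`, and `max_lag` are hypothetical.

```python
import math


def signal_to_dysperiodicity_ratio(signal, frame_len, min_lag, max_lag):
    """Return an illustrative signal-to-dysperiodicity ratio in dB.

    Assumption (not taken from the abstract): the dysperiodicity of a
    frame is the difference between the frame and a gain-matched copy of
    the signal shifted back by the lag that minimizes the difference energy.
    """
    signal_energy = 0.0
    dysperiodicity_energy = 0.0
    # Start at max_lag so that every candidate lag has enough history.
    for start in range(max_lag, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        frame_energy = sum(x * x for x in frame)
        best = None
        for lag in range(min_lag, max_lag + 1):
            shifted = signal[start - lag:start - lag + frame_len]
            shifted_energy = sum(y * y for y in shifted)
            if shifted_energy == 0.0:
                continue  # a silent reference segment cannot be gain-matched
            # Equalize the energies of the two segments before comparing them.
            gain = math.sqrt(frame_energy / shifted_energy)
            diff_energy = sum((x - gain * y) ** 2 for x, y in zip(frame, shifted))
            if best is None or diff_energy < best:
                best = diff_energy
        if best is None:
            continue  # no usable lag for this frame
        signal_energy += frame_energy
        dysperiodicity_energy += best
    if signal_energy == 0.0:
        raise ValueError("signal is too short or silent")
    if dysperiodicity_energy == 0.0:
        return math.inf  # perfectly periodic under this estimate
    return 10.0 * math.log10(signal_energy / dysperiodicity_energy)
```

With this sketch, a nearly periodic signal yields a large ratio and added noise lowers it, which is the general behavior expected of a cue meant to track perceived hoarseness. Ratios of this kind, computed per recording, could then serve as input features to a classifier such as a support vector machine.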
Bibliographic reference. Alpan, Ali / Schoentgen, Jean / Maryn, Youri / Grenez, Francis (2010): "Automatic perceptual categorization of disordered connected speech", In INTERSPEECH-2010, 2574-2577.