State-of-the-art objective measures of voice quality are mostly based on features extracted from the magnitude spectrum. If speech is modeled as the combination of a minimum-phase component (the vocal tract filter) and a maximum-phase component (the glottal source), the magnitude spectrum alone cannot capture the maximum-phase characteristics. Since voice quality is connected to the glottal source, the extracted features should be linked to the maximum-phase component of speech. This work proposes a new metric based on the phase spectrum for characterizing the maximum-phase component of the glottal source. The proposed feature, the Phase Distortion Deviation, reveals irregularities of the glottal pulses and can therefore be used for detecting voice disorders. It is evaluated on the task of ranking speakers with spasmodic dysphonia. Results show that the obtained ranking is highly correlated with the subjective ranking provided by doctors in terms of overall severity, tremor and jitter. The high correlation of the proposed feature with these different metrics reveals its ability to capture voice irregularities and highlights the importance of the phase spectrum in voice quality assessment.
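The claim that the magnitude spectrum cannot separate minimum-phase from maximum-phase behavior can be illustrated with a small numerical sketch. It does not reproduce the Phase Distortion Deviation or any part of the authors' method. The signals, the decay factor 0.8, the length N, and the helper `dft` are all hypothetical choices made for illustration: a truncated decaying exponential (a minimum-phase sequence) and its time reversal (a maximum-phase sequence) have the same magnitude spectrum and differ only in phase.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real sequence (illustrative helper)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Hypothetical toy signals, not taken from the paper.
N = 32
decaying = [0.8 ** t for t in range(N)]  # zeros inside the unit circle: minimum phase
growing = decaying[::-1]                 # time reversal: zeros outside, maximum phase

spec_decaying = dft(decaying)
spec_growing = dft(growing)

# Magnitude spectra: identical up to floating-point rounding.
mag_decaying = [abs(c) for c in spec_decaying]
mag_growing = [abs(c) for c in spec_growing]
max_mag_diff = max(abs(a - b) for a, b in zip(mag_decaying, mag_growing))

# Phase spectra (wrapped to (-pi, pi]): clearly different.
phase_decaying = [cmath.phase(c) for c in spec_decaying]
phase_growing = [cmath.phase(c) for c in spec_growing]

print("largest magnitude difference:", max_mag_diff)
print("phase at bin 1 (decaying, growing):", phase_decaying[1], phase_growing[1])
```

Any feature computed only from `mag_decaying` would be unable to tell the two signals apart, which is the motivation the abstract gives for turning to the phase spectrum.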
Bibliographic reference. Koutsogiannaki, Maria / Simantiraki, Olympia / Degottex, Gilles / Stylianou, Yannis (2014): "The importance of phase on voice quality assessment", In INTERSPEECH-2014, 1653-1657.