Cepstral-based acoustic cues for the analysis of disordered voices have been investigated in a number of studies. It has been shown that cepstral-based acoustic cues such as the harmonics-to-noise ratio (HNR) and the amplitude of the first rahmonic (R1A) provide acoustic correlates of hoarse voice quality. The aim of this paper is to investigate the acoustic analysis of speech by means of spectral acoustic cues obtained via empirical mode decomposition (EMD) of the log-magnitude spectrum of the speech signal, as an alternative to cepstral-based acoustic cues. The spectral acoustic cues investigated are the harmonics-to-noise ratio (HNR) and the amplitude of the first harmonic (H1A). The EMD-based spectral acoustic cues are evaluated on a corpus of synthetic stimuli generated by a synthesizer of disordered voices, as well as on a corpus of natural sustained vowels produced by 251 normophonic and dysphonic speakers. The performance of the EMD-based spectral acoustic cues (HNREMD and H1A) is compared to that of the cepstral-based acoustic cues (HNRceps and R1A). Experimental results show that the EMD-based spectral acoustic cues outperform the cepstral-based acoustic cues in terms of correlation with the perceived degree of hoarseness, defined as the overall quality of the voice and given by the grade (G) of the GRBAS scale.
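As an illustration of the cepstral baseline mentioned above, the following minimal sketch computes a real cepstrum and picks the first rahmonic peak of one frame. It is not the authors' implementation: the function names, the log floor, the pitch search range, and the synthetic test signal are all assumptions chosen for the example, and a naive DFT is used only to keep the code free of dependencies. The EMD-based cues themselves are not reproduced here.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for one short frame)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    # Precompute the N complex roots of unity once.
    w = [cmath.exp(sign * 2j * math.pi * k / n) for k in range(n)]
    out = [sum(x[t] * w[(k * t) % n] for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def real_cepstrum(frame, floor=1e-6):
    """Inverse DFT of the log-magnitude spectrum of one frame.

    `floor` is an assumed lower bound that avoids log(0) in empty bins.
    """
    log_mag = [math.log(max(abs(v), floor)) for v in dft(frame)]
    return [v.real for v in dft(log_mag, inverse=True)]


def first_rahmonic(frame, fs, f0_min=60.0, f0_max=400.0):
    """Return (quefrency in samples, amplitude) of the largest cepstral peak
    within the quefrency range of plausible pitch periods (assumed range)."""
    ceps = real_cepstrum(frame)
    lo = int(fs / f0_max)
    hi = int(fs / f0_min)
    q = max(range(lo, hi + 1), key=lambda i: ceps[i])
    return q, ceps[q]


# Hypothetical test signal: 39 equal-amplitude harmonics of 100 Hz,
# sampled at 8 kHz. The frame holds an integer number of periods,
# so no analysis window is applied in this sketch.
fs, f0, n = 8000, 100, 800
frame = [
    sum(math.cos(2 * math.pi * h * f0 * t / fs) for h in range(1, 40))
    for t in range(n)
]

q, r1a = first_rahmonic(frame, fs)
print(f"first rahmonic at {q} samples ({fs / q:.1f} Hz), amplitude {r1a:.3f}")
```

For this noise-free synthetic frame the peak falls at a quefrency of 80 samples, that is, the 100 Hz pitch period. On real sustained vowels a window, a longer frame, and a faster FFT routine would normally be used, and the peak amplitude would drop as the voice becomes noisier.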
Bibliographic reference. Kacha, Abdellah / Grenez, Francis / Schoentgen, Jean (2013): "Empirical mode decomposition-based spectral acoustic cues for disordered voices analysis", In INTERSPEECH-2013, 3632-3636.