A new method for calculating a spectral signal-to-noise ratio (SNR) in speech signals is presented. The method is based on a source-filter model of voice production and filters the magnitude spectrum by means of a fundamental-frequency-adaptive cepstrum comb-liftering algorithm. The SNR is defined as the level difference, within a given frequency band, between the original, unfiltered spectrum and the filtered (noise) spectrum. The method is tested with synthetic /a:/-like signals generated at fundamental frequencies of 110 and 220 Hz, which differ in the relative level of the noise burst, the jitter factor, or the shimmer factor. The resulting SNR values are compared with those obtained with an adaptation of the method described by Hiraoka et al. (1984). Results indicate that the method is sensitive to the amount of noise in the signal and to the degree of perturbation, especially jitter. Measurements on recordings of normal and pathological voices indicate that the obtained SNR values can be used as one of the acoustical correlates of (pathological) voice quality.
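The general idea of comb-liftering can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `comb_lifter_snr`, the Hann window, the lifter half-width, the default band limits, and the assumption that the fundamental frequency `f0` is already known are all illustrative choices, and any further processing steps described in the paper are not reproduced here.

```python
import numpy as np

def comb_lifter_snr(x, fs, f0, band=(60.0, 2000.0), half_width=3):
    """Rough spectral SNR estimate in dB via cepstral comb-liftering.

    All parameters are illustrative assumptions:
    x          one analysis frame (1-D array)
    fs         sampling rate in Hz
    f0         fundamental frequency in Hz, assumed known
    band       frequency band (Hz) over which levels are compared
    half_width samples zeroed on each side of every rahmonic
    """
    n = len(x)
    spectrum = np.fft.fft(x * np.hanning(n))
    log_mag = np.log(np.abs(spectrum) + 1e-12)   # natural-log magnitude
    cepstrum = np.fft.ifft(log_mag).real         # real cepstrum

    # Comb lifter: zero the cepstrum around multiples of the pitch period
    # (the rahmonics), keeping the lifter symmetric about n/2.
    period = fs / f0
    lifter = np.ones(n)
    k = 1
    while k * period < n / 2:
        centre = int(round(k * period))
        lo = max(centre - half_width, 1)
        hi = min(centre + half_width, n // 2)
        lifter[lo:hi + 1] = 0.0
        lifter[n - hi:n - lo + 1] = 0.0          # mirrored quefrencies
        k += 1

    # Back to the log-spectral domain: harmonic ripple is suppressed,
    # leaving an estimate of the noise spectrum.
    noise_log_mag = np.fft.fft(cepstrum * lifter).real

    # Level difference (dB) between original and liftered spectrum in the band.
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    sel = (freqs >= band[0]) & (freqs <= band[1])
    level_orig = 10.0 * np.log10(np.sum(np.exp(2.0 * log_mag[sel])))
    level_noise = 10.0 * np.log10(np.sum(np.exp(2.0 * noise_log_mag[sel])))
    return level_orig - level_noise
```

Under these assumptions, a harmonic signal with little added noise should yield a larger value than the same signal with more added noise, because removing the rahmonics lowers the liftered spectrum further below the harmonic peaks.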
Cite as: de Krom, G. (1990) A new cepstrum-based technique for the estimation of spectral signal-to-noise ratios in speech signals. Proc. ESCA Workshop on Speaker Characterization in Speech Technology, 83-93.
@inproceedings{krom90_scst,
  author={Guus de Krom},
  title={{A new cepstrum-based technique for the estimation of spectral signal-to-noise ratios in speech signals}},
  year=1990,
  booktitle={Proc. ESCA Workshop on Speaker Characterization in Speech Technology},
  pages={83--93}
}