INTERSPEECH 2015
16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

How the Slope of the Speech Spectrum Affects the Perception of Speaker Size

Kodai Yamamoto (1), Toshio Irino (1), Ryuichi Nisimura (1), Hideki Kawahara (1), Roy D. Patterson (2)

(1) Wakayama University, Japan
(2) University of Cambridge, UK

We performed a behavioral experiment to demonstrate the effect of spectral slope on the perception of speaker size, and we developed an auditory model based on the dynamic compressive gammachirp filterbank (dcGC-FB) to explain the results. STRAIGHT was used to generate “unvoiced” and “whispered” versions of naturally recorded words; the only difference between them was that the spectral slope of the whispered words was tilted up by 6 dB/octave relative to that of the unvoiced words. The experiment confirmed that listeners perceive the whispered words as coming from smaller speakers. The auditory model uses the tonotopic excitation pattern, Ep, as the internal representation of speech sounds. The model is much more effective when the gradient of the excitation pattern, ∇Ep, is included in the size discrimination process. The extended model is particularly useful for explaining variability across individual subjects.
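As a rough illustration of what a 6 dB/octave spectral tilt means, the sketch below applies such a tilt to a signal in the frequency domain. It is not the STRAIGHT-based procedure used in the study. The function names `tilt_spectrum` and `level_change_db`, the 1 kHz pivot frequency `ref_hz`, and the FFT-based approach are all assumptions made for this example.

```python
import numpy as np


def tilt_spectrum(signal, sample_rate, slope_db_per_octave=6.0, ref_hz=1000.0):
    """Tilt the magnitude spectrum by a fixed slope in dB/octave.

    Hypothetical helper: the gain is 0 dB at ref_hz (an assumed pivot),
    rises by slope_db_per_octave for each doubling of frequency above it,
    and falls by the same amount for each halving below it.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    # log2(f / ref_hz) counts octaves relative to the pivot.
    # The DC bin (f = 0) has no defined octave distance, so it keeps 0 dB gain.
    gain_db = np.zeros_like(freqs)
    positive = freqs > 0
    gain_db[positive] = slope_db_per_octave * np.log2(freqs[positive] / ref_hz)

    gain = 10.0 ** (gain_db / 20.0)  # dB to linear amplitude
    return np.fft.irfft(spectrum * gain, n=len(signal))


def level_change_db(before, after):
    """RMS level difference between two signals, in dB."""
    rms_before = np.sqrt(np.mean(np.square(before)))
    rms_after = np.sqrt(np.mean(np.square(after)))
    return 20.0 * np.log10(rms_after / rms_before)


# Usage: a pure tone one octave above the pivot is raised by the slope value.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone_2k = np.sin(2 * np.pi * 2000.0 * t)
change = level_change_db(tone_2k, tilt_spectrum(tone_2k, sample_rate))
print(f"Level change at 2 kHz: {change:.2f} dB")
```

With this tilt, components one octave above the pivot gain 6 dB and components one octave below lose 6 dB, so the overall spectral balance shifts toward high frequencies while the fine structure of the signal is left unchanged.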


Bibliographic reference. Yamamoto, Kodai / Irino, Toshio / Nisimura, Ryuichi / Kawahara, Hideki / Patterson, Roy D. (2015): "How the slope of the speech spectrum affects the perception of speaker size", in INTERSPEECH-2015, 1556-1560.