8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Discrimination and Recognition of Scaled Word Sounds

Toshio Irino (1), Yoshie Aoki (1), Yoshie Hayashi (1), Hideki Kawahara (1), Roy D. Patterson (2)

(1) Wakayama University, Japan
(2) University of Cambridge, UK

Smith et al. [2] and Ives et al. [3] demonstrated that humans can extract information about the size of a speaker's vocal tract from speech sounds (vowels and syllables, respectively). We have extended their discrimination and recognition experiments to naturally pronounced words. The Just Noticeable Difference (JND) for size discrimination was between 5.5% and 19%, depending on the listener. The smallest JND is comparable to that of the syllable experiments; the average JND is comparable to that of the vowel experiments. Word recognition scores remained above 50% for speaker sizes beyond the normal range for humans. The fact that good performance extends over such a large range of acoustic scales supports Irino and Patterson's hypothesis [1] that the auditory system segregates size and shape information at an early stage of processing.


Bibliographic reference. Irino, Toshio / Aoki, Yoshie / Hayashi, Yoshie / Kawahara, Hideki / Patterson, Roy D. (2007): "Discrimination and recognition of scaled word sounds", in INTERSPEECH-2007, 378-381.