International Workshop on Hands-Free Speech Communication (HSC2001)

April 9-11, 2001
Kyoto, Japan

Effect of Harmonic Structure of Noises on Noisy Vowel Perception

Kentaro Ishizuka and Kiyoaki Aikawa

NTT Communication Science Laboratories, Atsugi City, Kanagawa, Japan

This paper reports new findings from noisy vowel perception experiments designed to obtain a new feature parameter for noise-robust automatic speech recognition. To obtain this parameter, we analyze the human auditory mechanism. We conducted two experiments to examine how listeners perceive natural vowels under very noisy conditions, namely a signal-to-noise ratio (SNR) of around -2 dB. First, we used eight types of noise: white noise and seven types of harmonic structured noise, each with the same flat spectral envelope and energy. Spectral envelopes have been widely used as the feature parameter for automatic speech recognition. However, our experimental results showed that perceptual identification scores differ significantly depending on the detailed spectral shape of the noise. This result suggests that the human auditory system uses sound features more detailed than spectral envelopes to perceive vowels in noisy environments. The differences between noises suggest that the even harmonic components of a vowel contribute less to noisy vowel perception than the odd harmonic components. Furthermore, the result implies that the human auditory system dynamically changes its use of time/frequency features corresponding to waveform and spectral structure. Second, we used five types of harmonic structured noise, each with a fundamental frequency close to that of the vowels. The result suggests that vowel perception depends on both the fundamental frequency of the noise and the SNR.
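
The abstract does not describe how the stimuli were generated, so the following is only a minimal sketch of one way to build a harmonic structured noise with a flat spectral envelope and mix it with a signal at a target SNR. The function names (`harmonic_noise`, `scale_to_snr`, `snr_db`), the equal-amplitude, random-phase construction, and all numeric values other than the -2 dB SNR are assumptions made for illustration, not the authors' procedure.

```python
import numpy as np


def harmonic_noise(f0, fs, duration, fmax, rng):
    """Sum equal-amplitude harmonics of f0 up to fmax (flat envelope), random phases."""
    t = np.arange(int(fs * duration)) / fs
    n_harm = int(fmax // f0)                      # number of harmonics below fmax
    k = np.arange(1, n_harm + 1)                  # harmonic indices 1..n_harm
    phases = rng.uniform(0.0, 2.0 * np.pi, n_harm)
    partials = np.sin(2.0 * np.pi * f0 * k[:, None] * t[None, :] + phases[:, None])
    return partials.sum(axis=0)


def snr_db(signal, noise):
    """SNR in dB computed from mean power of signal and noise."""
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))


def scale_to_snr(signal, noise, target_db):
    """Return noise rescaled so that snr_db(signal, scaled_noise) == target_db."""
    p_sig = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_sig / (p_noise * 10.0 ** (target_db / 10.0)))
    return noise * gain


rng = np.random.default_rng(0)
fs = 16000                                        # assumed sampling rate (Hz)
t = np.arange(int(fs * 0.5)) / fs

# Placeholder for a recorded natural vowel (assumption: a two-partial tone).
vowel = np.sin(2 * np.pi * 120 * t) + 0.5 * np.sin(2 * np.pi * 240 * t)

# Two maskers: white noise and a harmonic structured noise (assumed f0 = 100 Hz).
white = rng.standard_normal(t.size)
harmonic = harmonic_noise(100.0, fs, 0.5, 4000.0, rng)

# Scaling both to the same SNR also gives them the same energy.
white_scaled = scale_to_snr(vowel, white, -2.0)
harmonic_scaled = scale_to_snr(vowel, harmonic, -2.0)
noisy_vowel = vowel + harmonic_scaled
```

Because both maskers are scaled against the same signal power, they end up with equal energy, which mirrors the abstract's condition that the noises share the same energy while differing only in fine spectral structure.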



Bibliographic reference.  Ishizuka, Kentaro / Aikawa, Kiyoaki (2001): "Effect of harmonic structure of noises on noisy vowel perception", In HSC2001, 155-158.