ISCA Workshop on Plasticity in Speech Perception (PSP2005)

Senate House, London, UK
June 15-17, 2005

Perceptual learning of noise vocoded words

Alexis Hervais-Adelman, Matt Davis, Robert P. Carlyon

MRC Cognition and Brain Sciences Unit, UK

Noise-vocoded (NV) speech is a spectrally reduced form of speech which can simulate perception by cochlear implant users. Although NV speech is initially unintelligible, listeners can correctly report 70% of words in NV sentences after 30 minutes of training with English sentences, but not after training with nonword sentences. The current work explores the processes involved in learning to understand single NV words: (i) does perceptual learning generalise to untrained words, (ii) is learning affected by providing feedback, (iii) can participants learn from training with nonwords, and (iv) does training improve discrimination, or simply modify subjects' report strategies? Using NV words rather than sentences reduces the role of context and short-term memory (STM) in the learning and report process, permitting a more detailed assessment of factors affecting the perception of NV speech. Experiment 1: 20 naïve volunteers were asked to repeat two groups of 60 NV words. Following each response, listeners received feedback: they heard either the NV ('distorted') word repeated and then the same word clearly (DC), or the word clearly then distorted (CD). Recognition was more accurate on the second group of NV words, showing that perceptual learning generalises to untrained lexical items. Consistent with sentence studies, subjects who received CD feedback performed significantly better than those receiving DC feedback. Knowledge of the identity of NV words speeds perceptual learning, consistent with involvement of top-down processes. Experiment 2: 24 naïve listeners took part in a crossover study; training materials were blocks of 60 distorted words or nonwords with DC feedback. Subjects were tested for recognition on different blocks of 40 NV words before training, after one type of training and after both.
Training with words was significantly more effective than training with nonwords, suggesting that the lack of learning with nonword sentences does not solely reflect limited STM capacity. Experiment 3: 32 naïve participants were tested on reporting 20 NV words, and on a 2AFC phoneme-discrimination test for 40 NV words, both at baseline (before training) and after training with varying numbers of NV words with CD feedback. Increased amounts of training improved report scores, but reliable changes in discrimination performance were unrelated to the amount of training exposure provided, suggesting that improved report scores do not solely reflect improved discrimination. In conclusion, learning to report NV words depends on a top-down learning mechanism which operates more effectively when the identity (Experiment 1) and lexicality (Experiment 2) of the training stimuli are known.


Bibliographic reference.  Hervais-Adelman, Alexis / Davis, Matt / Carlyon, Robert P. (2005): "Perceptual learning of noise vocoded words", In PSP2005, 76.