ISCA Archive PSP 2005

Weighting and waiting: experience mediated cue integration in spoken word recognition

Meghan Clayards, Richard Aslin, Michael Tanenhaus

Phonemic contrasts are signalled by multiple acoustic cues, with any one cue often being ambiguous. Cues must be evaluated with respect to each other due to trading relations and may also vary in reliability across different contexts, making it necessary for cues to be integrated and weighted. In vision, cues are weighted according to their reliability (Jacobs, 2002), with cue weights continually updated by recent experience. We evaluated the effects of experience on cue weighting of speech stimuli by varying the reliability of two cues to word-medial voicing (preceding vowel length and closure duration). During a short training phase (15-20 min), subjects heard a word as they viewed a four-picture display and clicked on the named picture. Training words had voiced or unvoiced medial consonants (e.g., ‘bubble’ and ‘rocket’). Their minimal-pair counterparts were not words (e.g., ‘bupple’ and ‘rogget’), allowing lexical bias to provide implicit training. For one group, preceding vowel length was a reliable cue (it predicted voicing) and closure duration was not. For the second group, the reverse was true. During testing, we monitored subjects' eye movements as they performed the same 4AFC task. They now heard pairs which were temporarily ambiguous between the voiced and unvoiced alternatives (e.g., ‘baker’ and ‘bagel’). The display contained pictures consistent with both alternatives, and the stimuli used a range of vowel and closure durations. Because lexical candidates are evaluated continuously as the acoustic signal unfolds, and acoustic cues arrive asynchronously, the relative proportion of looks to the alternative pictures should reflect the strength of the subject's commitment to each cue over time. Initial biases should be towards the candidate most consistent with the vowel duration cue. Later looks should be to the candidate consistent with the closure duration cue.
As predicted, early looks were consistent with biases generated by the vowel duration cue and later looks were consistent with biases generated by the closure duration cue. Crucially, subjects who heard the useful vowel duration cue in training maintained the vowel duration bias for longer than subjects who heard the useful closure duration cue in training. Our results demonstrate that listeners weight probabilistic acoustic cues according to their reliability, updating the weighting of those cues to reflect recent experience.
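
As a purely illustrative sketch of reliability-based weighting, the Python below uses the standard inverse-variance scheme from the cue-combination literature. The abstract does not specify a model, so the weighting rule, the function names, and all numbers here are assumptions chosen for illustration, not the study's analysis or data.

```python
def reliability_weights(variances):
    """Normalized weights proportional to each cue's inverse variance.

    Assumption: reliability is modeled as 1 / variance (not stated in the abstract).
    """
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [x / total for x in inverse]


def combine(estimates, variances):
    """Reliability-weighted average of the per-cue estimates."""
    weights = reliability_weights(variances)
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical per-cue evidence for "voiced" on a 0-1 scale:
# the vowel cue suggests voiced (0.8), the closure cue suggests unvoiced (0.3).
estimates = [0.8, 0.3]

# Hypothetical listener trained with a reliable vowel cue (low vowel variance).
vowel_trained = combine(estimates, [1.0, 4.0])

# Hypothetical listener trained with a reliable closure cue (low closure variance).
closure_trained = combine(estimates, [4.0, 1.0])

print(round(vowel_trained, 2), round(closure_trained, 2))
```

Under these made-up values the same two cue estimates yield a combined percept closer to the vowel cue for the first listener and closer to the closure cue for the second, which is the qualitative pattern a reliability-weighting account predicts.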

Jacobs, R. A. (2002). What determines visual cue reliability? Trends in Cognitive Sciences, 6(8), 345-350.


Cite as: Clayards, M., Aslin, R., Tanenhaus, M. (2005) Weighting and waiting: experience mediated cue integration in spoken word recognition. Proc. ISCA Workshop on Plasticity in Speech Perception (PSP 2005), 69 (abstract)

@inproceedings{clayards05_psp,
  author={Meghan Clayards and Richard Aslin and Michael Tanenhaus},
  title={{Weighting and waiting: experience mediated cue integration in spoken word recognition}},
  year=2005,
  booktitle={Proc. ISCA Workshop on Plasticity in Speech Perception (PSP 2005)},
  pages={69 (abstract)}
}