ISCA Archive AVSP 2019

Neural processing of degraded speech using speaker’s mouth movement

Tomomi Mizuochi-Endo, Michiru Makuuchi

Previous studies have reported that visual speech cues enhance speech perception, but which brain areas critically contribute to successful audio-visual (AV) integration of degraded speech remains unclear. To clarify this, we performed an fMRI study on word perception using noise-vocoded speech accompanied by clips showing a speaker’s face. We recruited 17 right-handed healthy adults, who were presented with short video clips in which a Japanese male read aloud Japanese 3-mora nouns. The sounds were noise vocoded at 16 and 32 bands. The experiment used a 2×2 factorial design crossing Modality (audio-only, A / audio-visual, AV) and Intelligibility (16 / 32 bands). In both the AV and A conditions, the sound and the clip were presented, but in the A conditions the speaker’s mouth was blurred. During fMRI scanning, participants were instructed to choose one word from a four-alternative forced-choice probe after they listened to/watched the sound and clip. Trial time courses in each model were estimated in the posterior superior temporal sulcus (STS), the lip motor area, and the inferior frontal gyrus of the left hemisphere. Behavioral data revealed that successful AV integration improved participants’ performance, in line with previous studies. Imaging data showed that the time course of activation in the speech-processing brain network differed depending on both the modality and the intelligibility of the speech.


doi: 10.21437/AVSP.2019-12

Cite as: Mizuochi-Endo, T., Makuuchi, M. (2019) Neural processing of degraded speech using speaker’s mouth movement. Proc. The 15th International Conference on Auditory-Visual Speech Processing, 57-62, doi: 10.21437/AVSP.2019-12

@inproceedings{mizuochiendo19_avsp,
  author={Tomomi Mizuochi-Endo and Michiru Makuuchi},
  title={{Neural processing of degraded speech using speaker’s mouth movement}},
  year=2019,
  booktitle={Proc. The 15th International Conference on Auditory-Visual Speech Processing},
  pages={57--62},
  doi={10.21437/AVSP.2019-12}
}