Auditory-Visual Speech Processing 2007 (AVSP2007)
Kasteel Groenendaal, Hilvarenbeek, The Netherlands
Many perceptual experiments show that human talkers provide more intelligible visual speech than synthetic talkers. This inferiority of synthetic visual speech may be due to insufficiently fine modeling of the parts of the face that are important for lipreading, or to the fact that some parts of the face that are not generally considered relevant to visual speech, or not visible in face-to-face communication, may actually provide information that humans are capable of decoding. Such information may therefore not be modeled accurately in the synthetic talker. In this paper, we provide evidence from Arabic that some sounds not usually considered visible, such as the pharyngeals, can be recognized correctly from visual information alone. We performed a lipreading experiment in which a set of Arabic consonant-vowel stimuli was presented as visual-only speech, and participants were asked to report what they recognized. The resulting consonant confusion matrix shows that some of the pharyngeals were, to some extent, well discriminated. The results are discussed in terms of phoneme category and vowel context.
Bibliographic reference: Ouni, Slim / Ouni, Kais (2007): "Arabic pharyngeals in visual speech", in AVSP-2007, paper P25.