ESCA Workshop on Audio-Visual Speech Processing (AVSP'97)

September 26-27, 1997
Rhodes, Greece

Elucidating the Complex Relationships Between Phonetic Perception and Word Recognition in Audiovisual Speech Perception

L. E. Bernstein, P. Iverson, E. T. Auer Jr.

Spoken Language Processes Laboratory, House Ear Institute, Los Angeles, CA, USA

This paper reports studies on the relationship between form-based (phonetic) word similarity, particularly at the level typically employed for deriving visemes, and word identification. Three experiments were conducted to (1) obtain phoneme identifications, (2) investigate word homopheny, and (3) examine open-set word identification. Computational methods were employed to predict word similarity based on phoneme identifications. The results showed that the viseme level is inadequate to predict word homopheny. Results also showed that, within conditions, the relative accuracy of word identification is related to the number of potentially ambiguous words in the lexicon.
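
As an illustration of the viseme-level prediction that the abstract reports to be inadequate, the following minimal sketch groups words whose phoneme strings collapse to the same viseme string. It is not the authors' computational method. The viseme classes, the phoneme symbols, the toy lexicon, and the names VISEME_CLASSES, LEXICON, viseme_key, and homophene_groups are all assumptions made for the example; in the study, groupings would be derived from the measured phoneme identifications.

```python
from collections import defaultdict

# Hypothetical viseme classes: phonemes assumed to look alike on the face.
# These groupings are illustrative only and are not taken from the paper.
VISEME_CLASSES = {
    "B": ["p", "b", "m"],
    "F": ["f", "v"],
    "T": ["t", "d", "n", "s", "z"],
    "K": ["k", "g"],
    "A": ["ae", "a"],
    "I": ["i", "ih"],
}

# Invert the table so each phoneme maps to its viseme label.
PHONEME_TO_VISEME = {
    phoneme: viseme
    for viseme, phonemes in VISEME_CLASSES.items()
    for phoneme in phonemes
}

# Toy lexicon with hypothetical phoneme transcriptions.
LEXICON = {
    "pat": ["p", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "mat": ["m", "ae", "t"],
    "fat": ["f", "ae", "t"],
    "vat": ["v", "ae", "t"],
    "kit": ["k", "ih", "t"],
}


def viseme_key(phonemes):
    """Collapse a phoneme sequence into its sequence of viseme labels."""
    return tuple(PHONEME_TO_VISEME[p] for p in phonemes)


def homophene_groups(lexicon):
    """Group words that share a viseme sequence (predicted homophenes)."""
    groups = defaultdict(list)
    for word, phonemes in lexicon.items():
        groups[viseme_key(phonemes)].append(word)
    return dict(groups)


groups = homophene_groups(LEXICON)
for key, words in groups.items():
    # The size of each group is the number of potentially ambiguous words.
    print("-".join(key), sorted(words), len(words))
```

Under these assumed classes, "pat", "bat", and "mat" fall into one predicted homophene group and "fat" and "vat" into another. A prediction based on graded phoneme confusions rather than discrete classes would replace the exact-match key with a similarity score between words.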

