Sixth International Conference on Spoken Language Processing
Automatic speechreading systems use visual information to support the acoustic signal, and they have been shown to yield better recognition performance than purely acoustic systems, especially when background noise is present. This paper seeks answers to the most important questions of speechreading: which features represent visual information well, and how can they be extracted? Well-known geometric moments are discussed as a means of representing visual speech. The proposed image ellipse axes are shown to be robust and computationally simple features for describing the shape of the lips. An intelligibility study was carried out to determine which part of the face gives the most support to speechreading. The whole face, the mouth, or the lips alone were shown, dubbed with a noisy voice. The visual support that the image ellipse model gives to speech perception is compared with that of the parts of the natural face.
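The abstract does not give the formulas used, so the following is only a minimal sketch of the textbook definition of an image ellipse: the ellipse whose second-order central moments match those of the image region. It assumes the lip region is already available as a binary (or weighted) mask; the function name `image_ellipse` and the sample mask are hypothetical and not taken from the paper.

```python
import math

def image_ellipse(image):
    """Return (cx, cy, semi_major, semi_minor, angle) for a 2-D mask.

    `image` is a list of rows of non-negative pixel weights
    (1 for lip pixels, 0 for background in the binary case).
    """
    # Zeroth- and first-order geometric moments give the centroid.
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            m00 += v
            m10 += x * v
            m01 += y * v
    if m00 == 0:
        raise ValueError("image has no foreground mass")
    cx, cy = m10 / m00, m01 / m00

    # Second-order central moments, normalised by the total mass.
    mu20 = mu02 = mu11 = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            dx, dy = x - cx, y - cy
            mu20 += dx * dx * v
            mu02 += dy * dy * v
            mu11 += dx * dy * v
    mu20, mu02, mu11 = mu20 / m00, mu02 / m00, mu11 / m00

    # Eigenvalues of the 2x2 moment matrix [[mu20, mu11], [mu11, mu02]].
    spread = math.sqrt(4.0 * mu11 ** 2 + (mu20 - mu02) ** 2)
    lam_major = (mu20 + mu02 + spread) / 2.0
    lam_minor = max((mu20 + mu02 - spread) / 2.0, 0.0)

    # A uniformly filled ellipse with semi-axis a has variance a^2 / 4
    # along that axis, hence the factor 2 below.
    semi_major = 2.0 * math.sqrt(lam_major)
    semi_minor = 2.0 * math.sqrt(lam_minor)

    # Orientation of the major axis relative to the x axis, in radians.
    angle = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return cx, cy, semi_major, semi_minor, angle


# Hypothetical mask: a wide, flat blob standing in for a closed mouth.
mask = [[0] * 11] + [[0] + [1] * 9 + [0] for _ in range(3)] + [[0] * 11]
cx, cy, semi_major, semi_minor, angle = image_ellipse(mask)
```

Under these assumptions, the two axis lengths act as a compact two-number description of mouth width and opening; only sums over pixels are needed, which is consistent with the abstract's claim of computational simplicity.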
Bibliographic reference. Czap, László (2000): "Lip representation by image ellipse", In ICSLP-2000, vol.4, 93-96.