Auditory-Visual Speech Processing (AVSP'99)
August 7-10, 1999
This paper presents the application of fuzzy set theory to automatic computer lip-reading from video images. Simple rules based on fuzzy sets were generated using mass assignment theory and were used for automatic feature extraction from video sequences. Probabilistic grid models were used to derive a knowledge base representing the visual data for phonemes or sounds. Phonemes from a medium-sized vocabulary of words were used for training and testing, and reasonable classification accuracy was achieved. The methods were also applied to the Tulips1 database, and the results illustrate that the learning techniques are efficient and general enough to be applied to different speakers.
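The abstract does not spell out how the fuzzy rules were built, so the following is only a minimal sketch of the general link between a fuzzy set and a mass assignment, as it is usually stated in mass assignment theory, and not the authors' implementation. It assumes a normalised discrete fuzzy set (largest membership equal to 1). The function names and the mouth-shape labels `closed`, `half_open`, and `open` are hypothetical and chosen purely for illustration.

```python
def fuzzy_to_mass_assignment(fuzzy_set):
    """Map a normalised discrete fuzzy set {element: membership}
    to a mass assignment {frozenset_of_elements: mass}.

    With memberships sorted as mu_1 = 1 >= mu_2 >= ... >= mu_n, the nested
    focal set {x_1, ..., x_i} receives mass mu_i - mu_(i+1), with mu_(n+1) = 0.
    """
    items = sorted(fuzzy_set.items(), key=lambda kv: kv[1], reverse=True)
    if not items or abs(items[0][1] - 1.0) > 1e-9:
        raise ValueError("expects a normalised fuzzy set (max membership 1)")

    masses = {}
    for i, (_, mu) in enumerate(items):
        next_mu = items[i + 1][1] if i + 1 < len(items) else 0.0
        mass = mu - next_mu
        if mass > 0:  # skip empty steps caused by tied or zero memberships
            focal = frozenset(name for name, _ in items[: i + 1])
            masses[focal] = mass
    return masses


def least_prejudiced_distribution(masses):
    """Share each focal set's mass equally among its elements,
    giving a probability distribution over single elements."""
    probs = {}
    for focal, mass in masses.items():
        share = mass / len(focal)
        for element in focal:
            probs[element] = probs.get(element, 0.0) + share
    return probs


# Hypothetical fuzzy set over mouth-shape labels (illustrative values only).
mouth_shape = {"closed": 1.0, "half_open": 0.5, "open": 0.25}
masses = fuzzy_to_mass_assignment(mouth_shape)
probabilities = least_prejudiced_distribution(masses)
```

Here the three nested focal sets carry masses 0.5, 0.25, and 0.25, and the resulting distribution sums to 1. The same correspondence can be read in the other direction, which is one way a fuzzy set might be derived from observed frequencies of a feature value.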
Bibliographic reference. Baldwin, James F. / Martin, Trevor P. / Saeed, Mehreen (1999): "Automatic computer lip-reading using fuzzy set theory", In AVSP-1999, paper #14.