September 22-25, 1997
This paper describes recent speechreading experiments on a speaker-independent continuous digit recognition task. Visual feature extraction is performed by a lip tracker that recovers information about the lip shape and about the grey-level intensity around the mouth. These features are used to train visual word models based on continuous density HMMs. Results show that the method generalises well to new speakers and that the recognition rate varies widely across digits, as expected given the high visual confusability of certain words.
Bibliographic reference. Luettin, Juergen (1997): "Towards speaker independent continuous speechreading", In EUROSPEECH-1997, 1991-1994.