INTERSPEECH 2004 - ICSLP
This contribution describes a method for automatic lip reading from video images. The output of the method is used in subsequent audio-visual speech processing and recognition. A simple image-processing method for finding the human face in a video image is presented first. The lips are then located within a region of interest in the marked face by means of a gradient method. This gradient method is based on the image histogram, which is computed from the colour values of the region of interest. Initial results for visual speech recognition of isolated words are presented in the conclusion. The method was used for face and lip detection in support of speech recognition.
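The abstract does not specify how the gradient is applied to the histogram, so the following is only a minimal sketch of one possible reading, not the author's algorithm. It assumes that the region of interest is given as rows of integer colour values (0 to 255) in a channel where lip pixels score higher than skin pixels, and that the threshold is placed where the falling edge of the dominant (skin) histogram peak ends. The function names and the threshold rule are hypothetical.

```python
def histogram(values, bins=256):
    """Count how often each integer colour value (0..bins-1) occurs."""
    counts = [0] * bins
    for v in values:
        counts[v] += 1
    return counts


def gradient(hist):
    """Central-difference gradient of the histogram (one-sided at both ends)."""
    n = len(hist)
    grad = [0.0] * n
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        grad[i] = (hist[hi] - hist[lo]) / (hi - lo)
    return grad


def threshold_after_main_peak(hist):
    """Hypothetical rule: first bin past the dominant peak where the
    histogram gradient is no longer negative (end of the falling edge)."""
    grad = gradient(hist)
    peak = max(range(len(hist)), key=lambda i: hist[i])
    for i in range(peak + 1, len(hist)):
        if grad[i] >= 0:
            return i
    return len(hist) - 1


def segment_lips(roi):
    """Return (mask, threshold); mask is 1 where the colour value is at or
    above the threshold (assumed lip pixels) and 0 elsewhere."""
    flat = [v for row in roi for v in row]
    t = threshold_after_main_peak(histogram(flat))
    mask = [[1 if v >= t else 0 for v in row] for row in roi]
    return mask, t
```

Calling `segment_lips` on a small region of interest returns a binary mask together with the chosen threshold. On real video frames the histogram would likely need smoothing before the gradient is taken, since single-bin noise can end the search too early.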
Bibliographic reference. Chaloupka, Josef (2004): "Automatic lips reading for audio-visual speech processing and recognition", In INTERSPEECH-2004, 2505-2508.