Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

A System for Audio-Visual Speech Recognition

I. Shdaifat (1), R.-R. Grigat (2)

(1) Collman Special Machines, Germany; (2) TU Hamburg Harburg, Germany

This work presents a system for audio-visual speech recognition. A new hybrid visual feature combination suited to audio-visual speech recognition was implemented. The features comprise both the shape and the appearance of the lips; their dimensionality is reduced using the discrete cosine transform (DCT). A large visual speech database of the German language, the German Audio-Visual Database (GAVD), has been assembled. Experiments using only visual features yielded high recognition accuracy, and the visual features drastically improved audio-visual speech recognition.
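
As an illustration of DCT-based dimensionality reduction for appearance features, the following is a minimal sketch and not the authors' implementation. It assumes the lip region is available as a small grayscale block and that the lowest-frequency coefficients are kept; the abstract does not state the block size, the number of coefficients, or the selection pattern, so the function names, the parameter `k`, and the square low-frequency selection are all hypothetical.

```python
import math


def dct2_1d(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct2_2d(block):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct2_1d(r) for r in block]
    cols = [dct2_1d(list(c)) for c in zip(*rows)]
    # Transpose back so the result is indexed [vertical freq][horizontal freq].
    return [list(r) for r in zip(*cols)]


def low_frequency_features(block, k):
    """Keep the k x k lowest-frequency coefficients as a flat feature vector.

    Assumption: a square low-frequency selection; the paper may select
    coefficients differently.
    """
    coeffs = dct2_2d(block)
    return [coeffs[u][v] for u in range(k) for v in range(k)]


# Hypothetical 4 x 4 "lip region" with uniform intensity.
roi = [[2.0] * 4 for _ in range(4)]
features = low_frequency_features(roi, 2)  # 16 pixels reduced to 4 coefficients
```

For a uniform block, all energy falls into the first (DC) coefficient and the remaining coefficients are zero up to rounding. This energy compaction is why truncating to a few low-frequency coefficients can serve as a compact appearance descriptor.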

Bibliographic reference.  Shdaifat, I. / Grigat, R.-R. (2005): "A system for audio-visual speech recognition", In INTERSPEECH-2005, 1197-1200.