Audio-Visual Automatic Speech Recognition promises to make speech recognition possible in noisy environments. Early and late fusion approaches dominate the field but may ignore linguistically relevant features. Distinctive features offer an alternative unit for fusion, and research has shown that this is feasible on subsets of phonemes. This paper outlines two extended models, multiclass and binary, and the results suggest that a 20 dB gain over audio-only recognition is achievable in low-SNR environments.
Bibliographic reference. Lewis, Trent W. / Powers, David M. W. (2008): "Distinctive feature fusion for recognition of Australian English consonants", In INTERSPEECH-2008, 2671-2674.