EUROSPEECH '97
5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997


Non-Linear Representations, Sensor Reliability Estimation and Context-Dependent Fusion in the Audiovisual Recognition of Speech in Noise

Pascal Teissier (1,2), Jean-Luc Schwartz (1), Anne Guerin-Dugue (2)

(1) Institut de la Communication Parlee CNRS UPRESA 5009 / INPG - U. Stendhal ICP, INPG, Grenoble Cedex, France
(2) Laboratoire de Traitement d'Images et de Reconnaissance des Formes LTIRF, INPG, Grenoble Cedex, France

This paper addresses the recognition of French audiovisual vowels at various signal-to-noise ratios (SNRs). It introduces a new non-linear preprocessing of the audio data that makes it possible to estimate the reliability of the audio sensor as a function of SNR, and that yields a significant increase in recognition performance at the output of the fusion process.

Bibliographic reference. Teissier, Pascal / Schwartz, Jean-Luc / Guerin-Dugue, Anne (1997): "Non-linear representations, sensor reliability estimation and context-dependent fusion in the audiovisual recognition of speech in noise", in EUROSPEECH-1997, 1611-1614.