Auditory-Visual Speech Processing (AVSP) 2011

Volterra, Italy
September 1-2, 2011

Bibliographic Reference

[AVSP-2011] Auditory-Visual Speech Processing (AVSP) 2011, Volterra, Italy, September 1-2, 2011; ed. by Giampiero Salvi, Jonas Beskow, Olov Engwall, and Samer Al Moubayed; ISCA Archive,

Introduction to the Workshop

Author Index and Quick Access to Abstracts

Al_Moubayed (71)   Al_Moubayed (99)   Al_Moubayed (107)   Alexandersson (71)   Alexandersson (107)   van Amelsvoort   Andersen   Arimoto   Attina   Berthommier (21)   Berthommier (77)   Beskow (71)   Beskow (107)   Best   Borràs-Comes   Burnham   de Campos   Cheng   Christmas   Colotte   Cosi   Cox   Cvejic   Czap (69)   Czap (137)   Davis (5)   Davis (15)   Davis (31)   Davis (73)   Edlund   Fagel   Fitzpatrick   Galatas   Granström   Hagita   House   Huang   Irwin   Ishi   Ishiguro   Jiang   Joosten   Kasisopa   Kim (5)   Kim (15)   Kim (31)   Kim (73)   Kittler   Kosmopoulos   Krahmer (25)   Krahmer (87)   Kroos   Kuratate   Latacz   Leone   Liu   Makedon   MASSEY   Mattheyses   Mátyás   McMurrough   Musti   Nahorna   Okanoya   Ouni   Paris   Pierce   Postma   Potamianos   Prieto   Pugliesi   Sahli   Saitoh   Schwartz (21)   Schwartz (77)   SJÖLANDER   Skantze   Swerts   Toutios   Verhelst   Visser   Windridge   Wu   Yan   Zhang  

Names written in boldface refer to first authors, in CAPITAL letters to keynote and invited papers. Full papers can be accessed from the abstracts.

Table of Contents and Access to Abstracts

Keynote Papers

Sjölander, Sverre: "Acoustical and visual processing in the animal kingdom", 1.

Massey, Colm: "From actor to avatar: real world challenges in capturing the human face", 3.


Paris, Tim / Kim, Jeesun / Davis, Chris: "Visual speech influences speeded auditory identification", 5-8.

Best, Catherine T. / Kroos, Christian / Irwin, Julia: "Do infants detect a-v articulator congruency for non-native click consonants?", 9-14.

Cvejic, Erin / Kim, Jeesun / Davis, Chris: "Perceiving visual prosody from point-light displays", 15-20.

Nahorna, Olha / Berthommier, Frédéric / Schwartz, Jean-Luc: "Binding and unbinding the McGurk effect in audiovisual speech fusion: follow-up experiments on a new paradigm", 21-24.

Visser, Mandy / Krahmer, Emiel / Swerts, Marc: "Children’s expression of uncertainty in collaborative and competitive contexts", 25-30.

Fitzpatrick, Michael / Kim, Jeesun / Davis, Chris: "The effect of seeing the interlocutor on auditory and visual speech production in noise", 31-35.

Burnham, Denis / Attina, Virginie / Kasisopa, Benjawan: "Auditory-visual discrimination and identification of lexical tone within and across tone languages", 37-42.

Borràs-Comes, Joan / Pugliesi, Cecilia / Prieto, Pilar: "Audiovisual perception of counter-expectational questions", 43-47.


Musti, Utpala / Colotte, Vincent / Toutios, Asterios / Ouni, Slim: "Introducing visual target cost within an acoustic-visual unit-selection speech synthesizer", 49-55.

Mattheyses, Wesley / Latacz, Lukas / Verhelst, Werner: "Auditory and photo-realistic audiovisual speech synthesis for Dutch", 55-60.

Wu, Peng / Jiang, Dongmei / Zhang, He / Sahli, Hichem: "Photo-realistic visual speech synthesis based on AAM features and an articulatory DBN model with constrained asynchrony", 61-66.

Perception and Modeling

Kim, Jeesun / Davis, Chris: "Audiovisual speech processing in visual speech noise", 73-76.

Berthommier, Frédéric / Schwartz, Jean-Luc: "Audiovisual streaming in voicing perception: new evidence for a low-level interaction between audio and visual modalities", 77-80.

Andersen, Tobias S.: "An ordinal model of the McGurk illusion", 81-86.

Joosten, Bart / Amelsvoort, Marije van / Krahmer, Emiel / Postma, Eric: "Thin slices of head movements during problem solving reveal level of difficulty", 87-92.

Arimoto, Yoshiko / Okanoya, Kazuo: "Dimensional mapping of multimodal integration on audiovisual emotion perception", 93-98.

Al Moubayed, Samer / Skantze, Gabriel: "Turn-taking control using gaze in multiparty human-computer dialogue: effects of 2D and 3D displays", 99-102.

Corpora and Applications

Galatas, Georgios / Potamianos, Gerasimos / Kosmopoulos, Dimitrios / McMurrough, Chris / Makedon, Fillia: "Bilingual corpus for AVASR using multiple sensors and depth information", 103-106.

Beskow, Jonas / Alexandersson, Simon / Al Moubayed, Samer / Edlund, Jens / House, David: "Kinetic data for large-scale analysis and modeling of face-to-face conversation", 107-110.

Kuratate, Takaaki / Pierce, Brennard / Cheng, Gordon: "“mask-bot” - a life-size talking head animated robot for AV speech and human-robot communication research", 111-116.

Saitoh, Takeshi: "Development of communication support system using lip reading", 117-122.

Leone, Giuseppe Riccardo / Cosi, Piero: "LUCIA-webGL: a web based Italian MPEG-4 talking head", 123-126.

Analysis and Recognition

Huang, Qiang / Cox, Stephen / Yan, Fei / Campos, Teo de / Windridge, David / Kittler, Josef / Christmas, William: "Improved detection of ball hit events in a tennis game using multimodal information", 127-130.

Ishi, Carlos T. / Liu, Chaoran / Ishiguro, Hiroshi / Hagita, Norihiro: "Speech-driven lip motion generation for tele-operated humanoid robots", 131-135.

Czap, László: "On the audiovisual asynchrony of speech", 137-140.

Demo Session

Fagel, Sascha: "Talking heads for elderly and Alzheimer patients (THEA): project report and demonstration", 67.

Czap, László / Mátyás, János: "Improving naturalness of visual speech synthesis", 69.

Al Moubayed, Samer / Alexandersson, Simon / Beskow, Jonas / Granström, Björn: "A robotic head using projected animated faces", 71.