Auditory-Visual Speech Processing (AVSP) 2010

Hakone, Kanagawa, Japan
September 30-October 3, 2010



Bibliographic Reference

[AVSP-2010] Auditory-Visual Speech Processing (AVSP) 2010, Hakone, Kanagawa, Japan, September 30-October 3, 2010; ISCA Archive, http://www.isca-speech.org/archive/av10




Author Index and Quick Access to Abstracts

Abdou   Ananthakrishnan   Andersen (P13)   Andersen (S2-1)   Andersen (S2-3)   Attina   Badin   Bailly   Berger   Berthommier   Best   Bowden   Burnham (P18)   Burnham (S4-2)   Callan   Chaloupka   Chew   Colotte   Cox   CSIBRA   Cvejic   Davis (S3-1)   Davis (S3-2)   Dean   Denda   Engwall (S2-2)   Engwall (S7-2)   Eskelund   Fookes (P7)   Fookes (S1-1)   Fordyce   Fujimoto   de Gelder (P17)   de Gelder (S3-3)   Gibert (P12)   Gibert (S4-2)   Hagita (P2)   Hagita (P5)   Haraikawa   Harvey (S7-3)   Harvey (S8-2)   Hashiba   Hashimoto   Hayamizu (P4)   Hayamizu (S1-4)   Hayes   Heracleous   Hilder   Hiramatsu (P17)   Hiramatsu (S3-3)   Hiramoto (P17)   Hiramoto (S3-3)   Hiroe   Hisanaga (S4-1)   Hisanaga (S5-3)   Igasaki   Imai (P17)   Imai (S3-3)   Irwin   Ishi (P2)   Ishi (P5)   Ishikawa   Joeffry   Kanekama   Kim (S3-1)   Kim (S3-2)   Kim (S6-1)   Kiriyama (P14)   Kiriyama (P15)   Kitamura   Kitaoka   Kitazawa (P14)   Kitazawa (P15)   KOBAYASHI   Kodama   Koizumi (P17)   Koizumi (S3-3)   Konishi   Kraus   Kroos   Kuratate   Kuroiwa   Lan   Lao   Latacz   Lucey (P7)   Lucey (S1-1)   Matsuda   Mattheyses   Miyajima   Miyazawa   Möttönen   Murayama   Musti   Nahorna   Nakadai   Nakamura   Nakayama   Navarathna   Newman   Nishimoto   Nishiura   Nouza   Ogawa   Ong   Ouni   Picard   Rice   Riley   Saitoh   Sakamoto   Sakaue   Samejima   Sams   Sasou   Sato, Masa-aki   Sato, Miki (P2)   Sato, Miki (P5)   Sato, Takao   Schwartz (P11)   Schwartz (S2-1)   Sekiyama (S4-1)   Sekiyama (S5-2)   Sekiyama (S5-3)   Shen   Shibata   Shinozaki   Sridharan (P7)   Sridharan (S1-1)   Stevens   Takebayashi (P14)   Takebayashi (P15)   Takeda   Takeuchi   Takiguchi   Tamura (P4)   Tamura (P6)   Tamura (S1-4)   Tanaka (P16)   Tanaka (P17)   Tanaka (S3-3)   Theobald (P1)   Theobald (S7-3)   Theobald (S8-2)   Tiippana (P10)   Tiippana (S2-1)   Toutios   Tsuge   Tuomainen   Vatikiotis-Bateson   Verhelst   Wik   Wrobel-Dautcourt   Yamada, Takao   Yamada, Takeshi   Yamamoto   Yoshida   
Yoshioka   Youssef   Yumoto  

Names written in boldface refer to first authors, in CAPITAL letters to keynote and invited papers. Full papers can be accessed from the abstracts.



Table of Contents and Access to Abstracts

Keynotes

Kobayashi, Tetsunori: "Robot as a multimodal human interface device", paper K1.

Csibra, Gergely: "What do human infants expect when adults communicate to them?", paper K2.

Recognition

Navarathna, Rajitha / Dean, David / Lucey, Patrick / Sridharan, Sridha / Fookes, Clinton: "Cascading appearance-based features for visual voice activity detection", paper S1-1.

Yoshida, Takami / Nakadai, Kazuhiro: "Audio-visual speech recognition system for a robot", paper S1-2.

Chaloupka, Josef / Nouza, Jan: "Audio-visual television broadcast programs processing, transcription, indexing and searching", paper S1-3.

Takeuchi, Shin'ichi / Hashiba, Takashi / Tamura, Satoshi / Hayamizu, Satoru: "Decision fusion by boosting method for multi-modal voice activity detection", paper S1-4.

Saitoh, Takeshi / Konishi, Ryosuke: "A study of influence of word lip reading by change of frame rate", paper S7-1.

Picard, Sébastien / Ananthakrishnan, G. / Wik, Preben / Engwall, Olov / Abdou, Sherif: "Detection of specific mispronunciations using audiovisual features", paper S7-2.

Lan, Yuxuan / Theobald, Barry-John / Harvey, Richard / Ong, Eng-Jon / Bowden, Richard: "Improving visual features for lip-reading", paper S7-3.

Perception - McGurk Effect

Schwartz, Jean-Luc / Tiippana, Kaisa / Andersen, Tobias S.: "Disentangling unisensory from fusion effects in the attentional modulation of McGurk effects: a Bayesian modeling study suggests that fusion is attention-dependent", paper S2-1.

Engwall, Olov: "Is there a McGurk effect for tongue reading?", paper S2-2.

Andersen, Tobias S.: "The McGurk illusion in the oddity task", paper S2-3.

Emotion, Prosody

Cvejic, Erin / Kim, Jeesun / Davis, Chris: "Abstracting visual prosody across speakers and face areas", paper S3-1.

Kim, Jeesun / Davis, Chris: "Emotion perception by eye and ear and halves and wholes", paper S3-2.

Tanaka, Akihiro / Koizumi, Ai / Imai, Hisato / Hiramatsu, Saori / Hiramoto, Eriko / Gelder, Beatrice de: "Cross-cultural differences in the multisensory perception of emotion", paper S3-3.

Perception

Kanekama, Yori / Hisanaga, Satoko / Sekiyama, Kaoru / Kodama, Narihiro / Samejima, Yasuhiro / Yamada, Takao / Yumoto, Eiji: "Long-term cochlear implant users have resistance to noise, but short-term users don’t", paper S4-1.

Attina, Virginie / Gibert, Guillaume / Vatikiotis-Bateson, Eric / Burnham, Denis: "Production of Mandarin lexical tones: auditory and visual components", paper S4-2.

Recognition, Synthesis (Poster Session)

Newman, Jacob L. / Theobald, Barry-John / Cox, Stephen J.: "Limitations of visual speech recognition", paper P1.

Heracleous, Panikos / Sato, Miki / Ishi, Carlos T. / Hagita, Norihiro: "Investigating the role of the Lombard reflex in visual and audiovisual speech recognition", paper P2.

Sasou, Akira / Hashimoto, Yasuharu / Sakaue, Katsuhiko: "Acoustic head gesture recognition and its applications", paper P3.

Shen, Peng / Tamura, Satoshi / Hayamizu, Satoru: "Evaluation of real-time audio-visual speech recognition", paper P4.

Ishi, Carlos T. / Sato, Miki / Hagita, Norihiro / Lao, Shihong: "Real-time audio-visual voice activity detection for speech recognition in noisy environments", paper P5.

Tamura, Satoshi / Miyajima, Chiyomi / Kitaoka, Norihide / Yamada, Takeshi / Tsuge, Satoru / Takiguchi, Tetsuya / Yamamoto, Kazumasa / Nishiura, Takanobu / Nakayama, Masato / Denda, Yuki / Fujimoto, Masakiyo / Matsuda, Shigeki / Ogawa, Tetsuji / Kuroiwa, Shingo / Takeda, Kazuya / Nakamura, Satoshi: "CENSREC-1-AV: an audio-visual corpus for noisy bimodal speech recognition", paper P6.

Chew, Sien W. / Lucey, Patrick / Sridharan, Sridha / Fookes, Clinton: "Exploring visual features through Gabor representations for facial expression detection", paper P7.

Toutios, Asterios / Musti, Utpala / Ouni, Slim / Colotte, Vincent / Wrobel-Dautcourt, Brigitte / Berger, Marie-Odile: "Towards a true acoustic-visual speech synthesis", paper P8.

Kuratate, Takaaki / Riley, Marcia: "Building speaker-specific lip models for talking heads from 3D face data", paper P9.

Perception - Brain

Callan, Daniel E.: "Brain regions differentially involved with multisensory and visual only speech gesture information", paper S5-1.

Shinozaki, Jun / Sekiyama, Kaoru / Hiroe, Nobuo / Yoshioka, Taku / Sato, Masa-aki: "Impact of language on audiovisual speech perception examined by fMRI", paper S5-2.

Hisanaga, Satoko / Sekiyama, Kaoru / Igasaki, Tomohiko / Murayama, Nobuki: "An ERP examination of audiovisual speech perception in Japanese younger and older adults", paper S5-3.

Perception - Infants

Kitamura, Christine / Kim, Jeesun: "Infants match auditory and visual speech in schematic point-light displays", paper S6-1.

Best, Catherine T. / Kroos, Christian / Irwin, Julia: "I can see what you said: infant sensitivity to articulator congruency between audio-only and silent-video presentations of native and nonnative consonants", paper S6-2.

Synthesis

Mattheyses, Wesley / Latacz, Lukas / Verhelst, Werner: "Optimized photorealistic audiovisual speech synthesis using active appearance modeling", paper S8-1.

Hilder, Sarah / Theobald, Barry-John / Harvey, Richard: "In pursuit of visemes", paper S8-2.

Youssef, Atef Ben / Badin, Pierre / Bailly, Gérard: "Acoustic-to-articulatory inversion in speech based on statistical models", paper S8-3.

Perception, Emotion, Interaction (Poster Session)

Tiippana, Kaisa / Hayes, Erin / Möttönen, Riikka / Kraus, Nina / Sams, Mikko: "The McGurk effect at various auditory signal-to-noise ratios in American and Finnish listeners", paper P10.

Nahorna, Olha / Berthommier, Frédéric / Schwartz, Jean-Luc: "Binding and unbinding in audiovisual speech fusion: removing the McGurk effect by an incoherent preceding audiovisual context", paper P11.

Gibert, Guillaume / Fordyce, Andrew / Stevens, Catherine J.: "Role of form and motion information in auditory-visual speech perception of McGurk combinations and fusions", paper P12.

Eskelund, Kasper / Tuomainen, Jyrki / Andersen, Tobias S. / , Nina: "Speech-specificity of two audiovisual integration effects", paper P13.

Ishikawa, Shogo / Kiriyama, Shinya / Takebayashi, Yoichi / Kitazawa, Shigeyoshi: "The multimodal analysis for understanding child behavior focused on attention-catching", paper P14.

Shibata, Kenichi / Kiriyama, Shinya / Haraikawa, Tomohiro / Takebayashi, Yoichi / Kitazawa, Shigeyoshi: "A study of speech interface for living space adapting to user environment by considering scenery situation", paper P15.

Miyazawa, Shiho / Tanaka, Akihiro / Sakamoto, Shuichi / Nishimoto, Takehiko: "Effects of speech-rate conversion on asynchrony perception of audio-visual speech", paper P16.

Koizumi, Ai / Tanaka, Akihiro / Imai, Hisato / Hiramatsu, Saori / Hiramoto, Eriko / Sato, Takao / Gelder, Beatrice de: "The effects of anxiety on the perception of emotion in the face and voice", paper P17.

Burnham, Denis / Joeffry, Sebastian / Rice, Lauren: "“d-o-e-s-not-c-o-m-p-u-t-e”: vowel hyperarticulation in speech to an auditory-visual avatar", paper P18.