2nd Workshop on Spoken Language Technologies for Under-Resourced Languages

Universiti Sains, Penang, Malaysia
May 3-5, 2010

Vietnamese Multimodal Social Affects: How Prosodic Attitudes can be Recognized and Confused

Dang-Khoa Mac (1,2), Véronique Aubergé (2), Albert Rilliard (3), Eric Castelli (1)

(1) International Research Center MICA, CNRS-UMI 2954, Hanoi, Vietnam
(2) Laboratory of Informatics of Grenoble (LIG), CNRS, France
(3) LIMSI-CNRS, Orsay, France

Social affective expression is a central part of face-to-face interaction and is closely tied to language through culture. This paper presents a study of audio-visual prosodic attitudes in Vietnamese, an under-resourced tonal language. Using an audio-visual corpus of 16 attitudes, perception experiments were carried out with Vietnamese and French participants. Analysis of the results shows the relative contributions of audio, visual, and audio-visual information to attitude perception. It also shows how native and non-native listeners recognize and confuse the attitudes, which allows us to investigate both the culture-specific attitudes and the cross-culturally shared attitudes in Vietnamese.

Index Terms: Audio-visual corpus, Prosodic social affects, Cross-cultural perception, Vietnamese

Bibliographic reference. Mac, Dang-Khoa / Aubergé, Véronique / Rilliard, Albert / Castelli, Eric (2010): "Vietnamese multimodal social affects: how prosodic attitudes can be recognized and confused", in SLTU-2010, 24-28.