ISCA Archive Interspeech 2009

Audio-visual prosody of social attitudes in vietnamese: building and evaluating a tones balanced corpus

Dang-Khoa Mac, Véronique Aubergé, Albert Rilliard, Eric Castelli

This paper presents the construction and a first evaluation of a tone-balanced audio-visual corpus of social affect in the Vietnamese language. This under-resourced tonal language exhibits specific glottalization and co-articulation phenomena, whose interactions with the prosody of attitudes are of particular interest. A well-controlled recording methodology was designed to build a large, representative audio-visual corpus of 16 attitudes produced by one speaker. A perception experiment was carried out to evaluate the speaker’s perceived performance and to study the role and integration of audio, visual, and audio-visual information in listeners’ perception of the speaker’s attitudes. The results reveal characteristics of Vietnamese prosodic attitudes and allow us to investigate such social affect in the Vietnamese language.


doi: 10.21437/Interspeech.2009-642

Cite as: Mac, D.-K., Aubergé, V., Rilliard, A., Castelli, E. (2009) Audio-visual prosody of social attitudes in vietnamese: building and evaluating a tones balanced corpus. Proc. Interspeech 2009, 2263-2266, doi: 10.21437/Interspeech.2009-642

@inproceedings{mac09_interspeech,
  author={Dang-Khoa Mac and Véronique Aubergé and Albert Rilliard and Eric Castelli},
  title={{Audio-visual prosody of social attitudes in vietnamese: building and evaluating a tones balanced corpus}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={2263--2266},
  doi={10.21437/Interspeech.2009-642}
}