7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Design of an Audio-Visual Speech Corpus for the Czech Audio-Visual Speech Synthesis

Milos Zelezný, Petr Císar, Zdenek Krnoul, Jan Novák

University of West Bohemia in Pilsen, Czech Republic

Our long-term goal is to design a system for Czech visual speech synthesis, that is, an animated synthetic face (often called a talking head) that imitates a human being pronouncing speech. In this paper we present the techniques used for acquiring data and building the audio-visual speech corpus, especially its visual part. This process involves recording stereoscopic video data and solving related problems such as synchronization. In addition, we present a simple method of using such a corpus, based on stereo vision principles and on modelling the shape of the lips with a simple triangular mesh.
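
As an illustration of the stereo vision principle mentioned in the abstract, the following is a minimal sketch and not the authors' method. It assumes an idealized, rectified pair of parallel cameras, with pixel coordinates measured from the principal point. The function name, the landmark names, the focal length, the baseline and the pixel values are all hypothetical. The sketch recovers 3D lip points from their left and right image positions and connects them into a two-triangle mesh.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline_mm):
    """Return (X, Y, Z) in mm for a point seen in a rectified stereo pair.

    Assumes parallel cameras and pixel coordinates relative to the
    principal point; the left camera is the origin of the 3D frame.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_mm / disparity  # depth from disparity
    return (x_left * z / focal_px, y * z / focal_px, z)


# Hypothetical camera parameters and lip landmarks (x_left, x_right, y) in pixels.
FOCAL_PX = 1000.0
BASELINE_MM = 100.0
landmarks = {
    "left_corner": (-50.0, -250.0, 0.0),
    "right_corner": (50.0, -150.0, 0.0),
    "upper_mid": (0.0, -200.0, -16.0),
    "lower_mid": (0.0, -200.0, 16.0),
}

# Reconstruct each landmark in 3D.
points = {
    name: triangulate_rectified(xl, xr, y, FOCAL_PX, BASELINE_MM)
    for name, (xl, xr, y) in landmarks.items()
}

# A simple triangular mesh: two triangles sharing the corner-to-corner edge.
triangles = [
    ("left_corner", "upper_mid", "right_corner"),
    ("left_corner", "right_corner", "lower_mid"),
]

# Mouth width and opening derived from the reconstructed points.
mouth_width_mm = points["right_corner"][0] - points["left_corner"][0]
mouth_opening_mm = points["lower_mid"][1] - points["upper_mid"][1]
```

With real recordings the cameras would first need calibration and rectification, and the landmark positions would come from image processing rather than hand-entered values.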


Bibliographic reference. Zelezný, Milos / Císar, Petr / Krnoul, Zdenek / Novák, Jan (2002): "Design of an audio-visual speech corpus for the Czech audio-visual speech synthesis", In ICSLP-2002, 1941-1944.