Interspeech'2005 - Eurospeech
Automatic analysis of tongue movement in large existing cineradiographic databases can provide valuable information for understanding speech production. We describe here a method for semi-automatic extraction of articulatory information from video observation, used to derive quasi-automatically a geometrical parameterization of vocal tract movements. The algorithm starts with a limited manual processing step consisting of marking 10 points (12 degrees of freedom) on 100 chosen key images. The whole sequence is then processed automatically thanks to a retro-marking method. First, the whole database is indexed via a similarity measure computed against the key images. Then, via this indexing, the geometrical information recovered on the key images is transferred to the original images. Several complementary error reduction methods are also proposed: averaging the geometrical configurations of a neighborhood, temporal filtering, and spline interpolation reduce the reconstruction error to about 10 pixels for a tongue contour of average length 260 pixels.
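The retro-marking idea described above can be sketched in a few lines: each unmarked frame is matched to its most similar key image, and the hand-marked point configuration of that key image (optionally averaged over the best few matches, one of the error-reduction steps mentioned) is transferred to the frame. The function names, the sum-of-squared-differences similarity measure, and the flat pixel-vector representation are illustrative assumptions, not the paper's exact implementation.

```python
# Hypothetical sketch of retro-marking: transfer hand-marked points from the
# most similar key image(s) to each frame. The SSD similarity measure and all
# names are assumptions for illustration, not the paper's actual method.

def ssd(a, b):
    """Sum of squared pixel differences between two equal-length frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def index_frames(frames, key_images):
    """Index the database: for each frame, find its most similar key image."""
    return [min(range(len(key_images)), key=lambda k: ssd(f, key_images[k]))
            for f in frames]

def transfer_points(frames, key_images, key_points, n_avg=1):
    """Assign to each frame the marked 2-D points of its nearest key image.
    Averaging over the n_avg best-matching key configurations corresponds to
    the neighborhood-averaging error reduction described in the abstract."""
    result = []
    for f in frames:
        ranked = sorted(range(len(key_images)),
                        key=lambda k: ssd(f, key_images[k]))
        best = ranked[:n_avg]
        pts = [tuple(sum(key_points[k][i][d] for k in best) / len(best)
                     for d in range(2))
               for i in range(len(key_points[0]))]
        result.append(pts)
    return result
```

In the full system each frame would be a radiographic image and each key configuration would hold the 10 marked points; temporal filtering and spline interpolation of the transferred contours would then follow as further error-reduction steps.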
Bibliographic reference. Fontecave, Julie / Berthommier, Frédéric (2005): "Quasi-automatic extraction of tongue movement from a large existing speech cineradiographic database", In INTERSPEECH-2005, 1081-1084.