11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Estimation of Speech Lip Features from Discrete Cosinus Transform

Zuheng Ming (1), Denis Beautemps (2), Gang Feng (2), Sébastien Schmerber (1)

(1) CHU Michallon, France
(2) GIPSA, France

This study is a contribution to the field of visual speech processing. It focuses on the automatic extraction of speech lip features from natural lips. The method directly predicts these features from predictors derived from a suitable transformation of the pixels in the lip region of interest. The transformation consists of a 2-D Discrete Cosine Transform (DCT) followed by a Principal Component Analysis (PCA) applied to a subset of the DCT coefficients, amounting to about 1% of all the coefficients. The results show that the geometric lip features can be estimated with good accuracy (a root mean square error of 1 to 1.4 mm for the lip aperture and the lip width) using a reduced set of predictors derived from the PCA.
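The pipeline described in the abstract (2-D DCT of the lip region, selection of a small subset of coefficients, PCA, then direct prediction of the geometric features) can be illustrated with a minimal sketch. Several choices below are assumptions and are not taken from the paper: the 40 x 60 region size, the use of the low-frequency 4 x 6 corner block as the roughly 1% subset, the linear least-squares predictor, and the toy ellipse images that stand in for real lip data. All function names are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def dct2(image):
    """2-D DCT-II, applied separably along rows and columns."""
    h, w = image.shape
    return dct_matrix(h) @ image @ dct_matrix(w).T

def low_freq_coeffs(image, keep_rows, keep_cols):
    """Keep only the low-frequency corner block (assumed subset choice)."""
    return dct2(image)[:keep_rows, :keep_cols].ravel()

def fit_pca(X, n_components):
    """Return the sample mean and the leading principal axes of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def fit_linear(P, Y):
    """Least-squares linear map with intercept (assumed predictor)."""
    A = np.hstack([P, np.ones((len(P), 1))])
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W

def predict(P, W):
    """Apply the linear map fitted by fit_linear."""
    return np.hstack([P, np.ones((len(P), 1))]) @ W

def synthetic_lip_image(width, aperture, shape=(40, 60)):
    """Toy stand-in for a lip region: a dark ellipse on a bright background."""
    h, w = shape
    y, x = np.mgrid[0:h, 0:w]
    inside = (((x - w / 2) / (width / 2)) ** 2
              + ((y - h / 2) / (aperture / 2)) ** 2) <= 1.0
    return np.where(inside, 0.0, 1.0)

def demo(n_samples=200, n_components=10, seed=0):
    """Fit on toy data and return the training RMSE for (width, aperture)."""
    rng = np.random.default_rng(seed)
    targets = np.column_stack([rng.uniform(20, 50, n_samples),
                               rng.uniform(4, 30, n_samples)])
    # 4 x 6 = 24 coefficients out of 40 x 60 = 2400, i.e. 1% of the total.
    X = np.array([low_freq_coeffs(synthetic_lip_image(wd, ap), 4, 6)
                  for wd, ap in targets])
    mean, components = fit_pca(X, n_components)
    P = (X - mean) @ components.T  # reduced set of predictors
    W = fit_linear(P, targets)
    err = predict(P, W) - targets
    return np.sqrt((err ** 2).mean(axis=0))

if __name__ == "__main__":
    print("toy RMSE (pixels) for width, aperture:", demo())
```

The error reported by this sketch is in pixels on synthetic training data, so it is not comparable to the millimetre figures quoted in the abstract.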

Bibliographic reference. Ming, Zuheng / Beautemps, Denis / Feng, Gang / Schmerber, Sébastien (2010): "Estimation of speech lip features from discrete cosinus transform", in INTERSPEECH-2010, 1612-1615.