INTERSPEECH 2009
10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Unsupervised Training Scheme with Non-Stereo Data for Empirical Feature Vector Compensation

L. Buera (1), Antonio Miguel (1), Alfonso Ortega (1), Eduardo Lleida (1), Richard M. Stern (2)

(1) Universidad de Zaragoza, Spain
(2) Carnegie Mellon University, USA

In this paper, a novel training scheme based on unsupervised and non-stereo data is presented for Multi-Environment Model-based LInear Normalization (MEMLIN) and MEMLIN with a cross-probability model based on GMMs (MEMLIN-CPM). Both are data-driven feature vector normalization techniques that have proven very effective in dynamic noisy acoustic environments. However, techniques of this kind usually require stereo data in a prior training phase, which can be an important limitation in real situations. To overcome this drawback, we present an approach based on the maximum likelihood (ML) criterion and Vector Taylor Series (VTS). Experiments were carried out with the Spanish SpeechDat Car database, yielding consistent improvements: 48.7% and 61.9% when the novel training process is applied to MEMLIN and MEMLIN-CPM, respectively.


Bibliographic reference. Buera, L. / Miguel, Antonio / Ortega, Alfonso / Lleida, Eduardo / Stern, Richard M. (2009): "Unsupervised training scheme with non-stereo data for empirical feature vector compensation", in INTERSPEECH-2009, 1247-1250.