INTERSPEECH 2006 - ICSLP
Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Feature Normalization Using Smoothed Mixture Transformations

Patrick Kenny, Vishwa Gupta, G. Boulianne, Pierre Ouellet, Pierre Dumouchel

Centre de Recherche Informatique de Montréal, Canada

We propose a method for estimating the parameters of SPLICE-like transformations from individual utterances, so that such transformations can be used to normalize acoustic feature vectors for speech recognition on an utterance-by-utterance basis, in a manner similar to cepstral mean normalization. We report results on an in-house French-language multi-speaker database collected while deploying an automatic closed-captioning system for live broadcast news. An unusual feature of this database is the very large amount of training data available for each speaker (typically several hours), which makes it very difficult to improve on multi-speaker modeling with standard speaker adaptation methods. We found that the proposed feature normalization method achieved a 6% relative improvement over cepstral mean normalization on this task.
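
For readers unfamiliar with the baseline, the following is a minimal sketch of utterance-level cepstral mean normalization, the technique the proposed method is compared against. It is not the authors' code and does not implement the smoothed mixture transformations themselves. The function name `cepstral_mean_normalize` and the representation of an utterance as a list of frames (each a list of cepstral coefficients) are assumptions made for illustration.

```python
from statistics import fmean


def cepstral_mean_normalize(utterance):
    """Subtract the per-utterance mean from every cepstral coefficient.

    `utterance` is a list of frames; each frame is a list of floats of
    equal length. This layout is an illustrative assumption, not a format
    taken from the paper.
    """
    if not utterance:
        return []
    num_coeffs = len(utterance[0])
    # Mean of each coefficient, computed over the frames of this utterance only.
    means = [fmean(frame[d] for frame in utterance) for d in range(num_coeffs)]
    # Remove that mean from every frame, so each coefficient averages to zero.
    return [[frame[d] - means[d] for d in range(num_coeffs)] for frame in utterance]
```

Because the mean is estimated from a single utterance, this normalization needs no speaker-specific training data, which is the property the abstract refers to when it describes normalizing "on an utterance-by-utterance basis".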

Bibliographic reference.  Kenny, Patrick / Gupta, Vishwa / Boulianne, G. / Ouellet, Pierre / Dumouchel, Pierre (2006): "Feature normalization using smoothed mixture transformations", In INTERSPEECH-2006, paper 1026-Mon1A2O.1.