An efficient speaker-normalization method based on the mapping of two self-organizing feature maps is developed. The normalization system consists of a reference map trained on the reference speaker's feature space and a test speaker's map, generated by a special topology-maintaining retraining of the reference map. The retraining procedure is called 'Forced Competitive Learning' (FCL). It ensures the topological identity of both maps and thereby implicitly establishes a 1:1 correspondence between the codebooks. This allows a 1:1 exchange of the feature vectors represented by the neurons of the reference map for those of the test map in the operation phase. Pilot tests on a 33-word database, including the 10 digits (3 male and 2 female speakers, 5 versions each), have been performed employing a simple HMM isolated-word recognizer. The evaluation was based on speaker-dependent recognition and has shown an average adaptation efficiency of p = 0.90. Because of its error tolerance and its applicability to virtually every feature space, even an abstract one, the proposed method can be applied broadly as a front end to all kinds of VQ-based recognition systems.
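The 1:1 codebook exchange in the operation phase can be illustrated with a minimal sketch. It assumes that both maps are stored as equally long lists of codebook vectors whose indices correspond, that the winning neuron is chosen by Euclidean distance, and that test-speaker vectors are replaced by reference-speaker vectors. The function names `nearest_neuron` and `normalize` and the toy codebook values are hypothetical; the FCL retraining itself is not shown, since the abstract does not specify it.

```python
from math import dist


def nearest_neuron(codebook, vector):
    """Return the index of the codebook vector closest to `vector` (Euclidean)."""
    return min(range(len(codebook)), key=lambda i: dist(codebook[i], vector))


def normalize(test_map, reference_map, vector):
    """Quantize `vector` on the test map and return the winning index together
    with the reference-map vector stored at that same index."""
    if len(test_map) != len(reference_map):
        raise ValueError("maps must have the same number of neurons")
    winner = nearest_neuron(test_map, vector)
    return winner, reference_map[winner]


# Toy 2-D codebooks with three neurons each; the values are invented for
# illustration and are not taken from the paper.
reference_map = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
test_map = [(0.2, 0.1), (1.3, 0.2), (0.1, 1.4)]

# A test-speaker feature vector is mapped to the reference speaker's space.
index, normalized = normalize(test_map, reference_map, (1.2, 0.3))
print(index, normalized)
```

Because the neuron index is shared between the two maps, a VQ-based recognizer trained on the reference speaker could, under these assumptions, consume either the index or the exchanged vector without modification.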
Bibliographic reference. Knohl, Lars / Rinscheid, Ansgar (1993): "Speaker normalization and adaptation based on feature-map projection", In EUROSPEECH'93, 367-370.