In this paper, we propose a piecewise nonlinear warping function based on Bark-scale shifts for speaker normalization, and a joint frequency-discontinuity and energy-attenuation detection algorithm to estimate the second subglottal resonance (Sg2). We then apply Sg2 to rapid speaker normalization. Experimental results on children’s speech recognition show that the proposed nonlinear warping function is more effective for speaker normalization than linear frequency warping. Compared to maximum-likelihood-based grid search methods, Sg2 normalization is more efficient and achieves comparable or better performance, especially when normalization data are limited.
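The idea of warping by a shift on the Bark scale can be illustrated with a minimal sketch. It is not the paper's implementation: the Hz-to-Bark conversion assumes Traunmüller's approximation, the function names are hypothetical, and deriving the shift as the Bark-domain difference between a reference Sg2 and the speaker's Sg2 is an assumption made only for illustration. The piecewise handling near the band edges mentioned in the abstract is omitted.

```python
import numpy as np

def hz_to_bark(f_hz):
    """Hz to Bark using Traunmueller's approximation (an assumption here)."""
    f_hz = np.asarray(f_hz, dtype=float)
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def bark_to_hz(z):
    """Algebraic inverse of hz_to_bark."""
    z = np.asarray(z, dtype=float)
    return 1960.0 * (z + 0.53) / (26.28 - z)

def bark_shift_warp(f_hz, shift_bark):
    """Warp frequencies by adding a constant offset on the Bark scale.

    A constant Bark shift is nonlinear in Hz, unlike linear scaling f -> a * f.
    """
    return bark_to_hz(hz_to_bark(f_hz) + shift_bark)

def sg2_shift(sg2_speaker_hz, sg2_reference_hz):
    """Hypothetical rule: shift that maps the speaker's Sg2 onto a reference Sg2."""
    return hz_to_bark(sg2_reference_hz) - hz_to_bark(sg2_speaker_hz)

if __name__ == "__main__":
    # Illustrative values only, not measurements from the paper.
    shift = sg2_shift(sg2_speaker_hz=1600.0, sg2_reference_hz=1400.0)
    freqs = np.array([500.0, 1000.0, 1600.0, 3000.0])
    print(bark_shift_warp(freqs, shift))
```

Because the shift is a single scalar computed from one Sg2 estimate, no search over candidate warping factors is needed, which is consistent with the efficiency argument in the abstract.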
Bibliographic reference. Wang, Shizhen / Lee, Yi-Hui / Alwan, Abeer (2009): "Bark-shift based nonlinear speaker normalization using the second subglottal resonance", In INTERSPEECH-2009, 1619-1622.