In this paper, we propose a Bark-scale shift based piecewise nonlinear warping function for speaker normalization, and a joint frequency discontinuity and energy attenuation detection algorithm to estimate the second subglottal resonance (Sg2). We then apply Sg2 for rapid speaker normalization. Experimental results on children's speech recognition show that the proposed nonlinear warping function is more effective for speaker normalization than linear frequency warping. Compared to maximum likelihood based grid search methods, Sg2 normalization is more efficient and achieves comparable or better performance, especially when normalization data are limited.
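To illustrate what a Bark-scale shift means as a frequency warp, here is a minimal sketch, not the paper's implementation. It assumes Traunmüller's Hz-to-Bark approximation (the abstract does not say which Bark formula is used) and omits the piecewise handling mentioned above. The function names, and the idea of deriving the shift from a speaker Sg2 and a reference Sg2, are hypothetical and serve only as illustration.

```python
def hz_to_bark(f_hz):
    """Traunmueller's approximation of the Bark scale (an assumed choice)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z):
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (z + 0.53) / (26.28 - z)


def bark_shift_warp(f_hz, shift_bark):
    """Warp a frequency by adding a constant offset on the Bark scale.

    A constant shift in Bark is a nonlinear warp in Hz, unlike the
    single scaling factor used in linear frequency warping.
    """
    return bark_to_hz(hz_to_bark(f_hz) + shift_bark)


def shift_from_sg2(sg2_speaker_hz, sg2_reference_hz):
    """Hypothetical helper: the Bark offset that maps the speaker's Sg2
    onto a reference Sg2. This is an assumption for illustration; the
    abstract does not state how the shift is derived from Sg2."""
    return hz_to_bark(sg2_reference_hz) - hz_to_bark(sg2_speaker_hz)


if __name__ == "__main__":
    # Illustrative values only, not measurements from the paper.
    shift = shift_from_sg2(sg2_speaker_hz=1800.0, sg2_reference_hz=1400.0)
    for f in (500.0, 1500.0, 3000.0):
        print(f, bark_shift_warp(f, shift))
```

Because the shift is fixed in Bark rather than in Hz, the effective Hz scaling factor differs between low and high frequencies, which is the sense in which such a warp is nonlinear.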
Cite as: Wang, S., Lee, Y.-H., Alwan, A. (2009) Bark-shift based nonlinear speaker normalization using the second subglottal resonance. Proc. Interspeech 2009, 1619-1622, doi: 10.21437/Interspeech.2009-212
@inproceedings{wang09b_interspeech,
  author={Shizhen Wang and Yi-Hui Lee and Abeer Alwan},
  title={{Bark-shift based nonlinear speaker normalization using the second subglottal resonance}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1619--1622},
  doi={10.21437/Interspeech.2009-212}
}