10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Bark-Shift Based Nonlinear Speaker Normalization Using the Second Subglottal Resonance

Shizhen Wang, Yi-Hui Lee, Abeer Alwan

University of California at Los Angeles, USA

In this paper, we propose a piecewise nonlinear warping function for speaker normalization based on a shift along the Bark scale, and a joint frequency discontinuity and energy attenuation detection algorithm to estimate the second subglottal resonance (Sg2). We then apply Sg2 to rapid speaker normalization. Experimental results on children’s speech recognition show that the proposed nonlinear warping function is more effective for speaker normalization than linear frequency warping. Compared to maximum likelihood based grid search methods, Sg2 normalization is more efficient and achieves comparable or better performance, especially when normalization data are limited.
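The contrast between linear frequency warping and a Bark-scale shift can be illustrated with a minimal sketch. The abstract does not give the warping formula, so everything here is an assumption made for illustration: the Hz-to-Bark conversion uses Traunmüller's closed-form approximation, the shift is taken as the Bark-domain difference between a reference Sg2 and the speaker's Sg2, the function names are hypothetical, and the piecewise treatment of the proposed function is reduced to clipping at 0 Hz.

```python
def hz_to_bark(f_hz):
    """Traunmueller's approximation of the Bark scale (an assumed choice)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z):
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (z + 0.53) / (26.28 - z)


def linear_warp(f_hz, alpha):
    """Linear frequency warping: every frequency is scaled by the same factor."""
    return alpha * f_hz


def bark_shift_warp(f_hz, shift_bark):
    """Shift a frequency by a constant amount on the Bark scale.

    The result is nonlinear in Hz. Clipping at 0 Hz is an illustrative
    simplification, not the piecewise rule of the paper.
    """
    return max(0.0, bark_to_hz(hz_to_bark(f_hz) + shift_bark))


def shift_from_sg2(sg2_speaker_hz, sg2_reference_hz):
    """Hypothetical rule: the shift that maps the speaker's Sg2 onto the reference."""
    return hz_to_bark(sg2_reference_hz) - hz_to_bark(sg2_speaker_hz)


if __name__ == "__main__":
    # Made-up Sg2 values, chosen only to exercise the functions.
    sg2_speaker, sg2_reference = 1700.0, 1400.0
    shift = shift_from_sg2(sg2_speaker, sg2_reference)
    alpha = sg2_reference / sg2_speaker
    for f in (500.0, 1700.0, 3000.0):
        print(f, linear_warp(f, alpha), bark_shift_warp(f, shift))
```

Under these assumptions both warps map the speaker's Sg2 onto the reference value, but the Bark shift changes low and high frequencies by different ratios, whereas the linear warp applies one ratio everywhere. Because the shift follows directly from a single Sg2 estimate, no search over candidate warping factors is needed.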

Bibliographic reference. Wang, Shizhen / Lee, Yi-Hui / Alwan, Abeer (2009): "Bark-shift based nonlinear speaker normalization using the second subglottal resonance", In INTERSPEECH-2009, 1619-1622.