INTERSPEECH 2010
11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

On Using Voice Source Measures in Automatic Gender Classification of Children's Speech

Gang Chen, Xue Feng, Yen-Liang Shue, Abeer Alwan

University of California at Los Angeles, USA

Acoustic characteristics of speech signals differ with gender due to physiological differences in the glottis and the vocal tract. Previous research [1] showed that adding the voice-source-related measures H1*-H2* and H1*-A3* improved gender classification accuracy compared to using only the fundamental frequency (F0) and formant frequencies. Hi* refers to the magnitude of the i-th source spectral harmonic, and Ai* refers to the magnitude of the source spectrum at the i-th formant. In this paper, three other voice-source-related measures, CPP, HNR and H2*-H4*, are used in gender classification of children's voices. CPP is the cepstral peak prominence, HNR is the harmonic-to-noise ratio, and H2*-H4* is the difference between the 2nd and 4th source spectral harmonic magnitudes. Results show that using these three features improves gender classification accuracy compared with [1].
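To make the harmonic-difference measures concrete, the sketch below computes an uncorrected Hi-Hj value in dB for one frame whose F0 is assumed to be known. It is an illustration only, not the authors' procedure: the starred measures in the abstract are defined on the source spectrum, which this sketch does not estimate, and the function names and the synthetic test frame are hypothetical.

```python
import math

def harmonic_db(signal, sample_rate, freq):
    """Magnitude in dB of the signal's DFT evaluated at one frequency (Hz)."""
    re = sum(x * math.cos(2 * math.pi * freq * n / sample_rate)
             for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * n / sample_rate)
             for n, x in enumerate(signal))
    return 20 * math.log10(math.hypot(re, im))

def harmonic_difference(signal, sample_rate, f0, i, j):
    """Uncorrected Hi-Hj in dB: i-th minus j-th harmonic magnitude.

    Assumption: f0 is supplied by the caller (no pitch tracking here),
    and no formant correction is applied.
    """
    return (harmonic_db(signal, sample_rate, i * f0)
            - harmonic_db(signal, sample_rate, j * f0))

# Hypothetical 40 ms frame: four harmonics of 250 Hz whose amplitudes
# halve from one harmonic to the next.
sample_rate = 8000
f0 = 250.0
amplitudes = [1.0, 0.5, 0.25, 0.125]
frame = [
    sum(a * math.sin(2 * math.pi * (k + 1) * f0 * n / sample_rate)
        for k, a in enumerate(amplitudes))
    for n in range(320)
]

h1_h2 = harmonic_difference(frame, sample_rate, f0, 1, 2)
h2_h4 = harmonic_difference(frame, sample_rate, f0, 2, 4)
print(f"H1-H2 = {h1_h2:.2f} dB, H2-H4 = {h2_h4:.2f} dB")
```

The frame length is chosen so that every harmonic completes a whole number of cycles, which keeps the single-frequency DFT free of leakage. Real speech would need windowing and a peak search around each expected harmonic.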

Reference

  1. Y.-L. Shue and M. Iseli, “The role of voice source measures on automatic gender classification,” in Proceedings of ICASSP, 2008, pp. 4493–4496.


Bibliographic reference. Chen, Gang / Feng, Xue / Shue, Yen-Liang / Alwan, Abeer (2010): "On using voice source measures in automatic gender classification of children's speech", in INTERSPEECH-2010, 673-676.