ISCA Archive Interspeech 2015

Exploring ANN back-ends for i-vector based speaker age estimation

Anna Fedorova, Ondřej Glembek, Tomi Kinnunen, Pavel Matějka

We address the problem of speaker age estimation using i-vectors. We first compare different i-vector extraction setups and then focus on (shallow) artificial neural network (ANN) back-ends. We explore the ANN architecture, the training algorithm, and ANN ensembles. Results on NIST 2008 and 2010 SRE data indicate that, after extensive parameter optimization, an ANN back-end combined with i-vectors reaches mean absolute errors (MAEs) of 5.49 (females) and 6.35 (males), which correspond to a 4.5% relative improvement over our support-vector regression (SVR) baseline. Hence, the choice of back-end did not affect the accuracy much; a suggested future direction is therefore to focus more on front-end processing.
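The two figures of merit quoted in the abstract, mean absolute error and relative improvement over a baseline, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names and the toy ages are assumptions introduced here, and the numbers are not taken from the paper.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average of |true - predicted| over all speakers, in years."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def relative_improvement(baseline_mae, system_mae):
    """Percentage by which system_mae is lower than baseline_mae."""
    return 100.0 * (baseline_mae - system_mae) / baseline_mae


# Toy data (hypothetical, for illustration only).
true_ages = [25, 40, 63, 31]
system_predictions = [30, 38, 55, 33]    # stands in for an ANN back-end
baseline_predictions = [32, 36, 53, 34]  # stands in for an SVR baseline

system_mae = mean_absolute_error(true_ages, system_predictions)
baseline_mae = mean_absolute_error(true_ages, baseline_predictions)
gain = relative_improvement(baseline_mae, system_mae)

print(f"system MAE:   {system_mae:.2f} years")
print(f"baseline MAE: {baseline_mae:.2f} years")
print(f"relative improvement: {gain:.1f}%")
```

Under this definition, a relative improvement is expressed as a percentage of the baseline MAE, so a small percentage corresponds to a difference of only a fraction of a year when the MAEs are around 5 to 6 years.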

doi: 10.21437/Interspeech.2015-103

Cite as: Fedorova, A., Glembek, O., Kinnunen, T., Matějka, P. (2015) Exploring ANN back-ends for i-vector based speaker age estimation. Proc. Interspeech 2015, 3036-3040, doi: 10.21437/Interspeech.2015-103

  author={Anna Fedorova and Ondřej Glembek and Tomi Kinnunen and Pavel Matějka},
  title={{Exploring ANN back-ends for i-vector based speaker age estimation}},
  booktitle={Proc. Interspeech 2015},