16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Exploring ANN Back-Ends for i-Vector Based Speaker Age Estimation

Anna Fedorova (1), Ondřej Glembek (2), Tomi Kinnunen (1), Pavel Matějka (2)

(1) University of Eastern Finland, Finland
(2) Brno University of Technology, Czech Republic

We address the problem of speaker age estimation using i-vectors. We first compare different i-vector extraction setups and then focus on (shallow) artificial neural network (ANN) back-ends. We explore the ANN architecture, the training algorithm, and ANN ensembles. The results on NIST 2008 and 2010 SRE data indicate that, after extensive parameter optimization, an ANN back-end in combination with i-vectors reaches mean absolute errors (MAEs) of 5.49 (females) and 6.35 (males), a 4.5% relative improvement over our support-vector regression (SVR) baseline. Hence, the choice of back-end did not affect the accuracy much; a suggested future direction is therefore to focus more on front-end processing.
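
As a reading aid for the two figures of merit quoted above, the following is a minimal sketch of how a mean absolute error and a relative improvement over a baseline are conventionally computed. The function names and the toy ages are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's own evaluation protocol may differ in detail.

```python
from typing import Sequence


def mean_absolute_error(true_ages: Sequence[float],
                        predicted_ages: Sequence[float]) -> float:
    """Average absolute difference between true and predicted ages."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def relative_improvement(baseline_mae: float, system_mae: float) -> float:
    """Percentage reduction in MAE relative to the baseline MAE."""
    return 100.0 * (baseline_mae - system_mae) / baseline_mae


# Toy values invented for illustration only (NOT results from the paper).
true_ages = [25.0, 40.0, 60.0, 33.0]
baseline_pred = [30.0, 36.0, 52.0, 36.0]  # hypothetical baseline outputs
ann_pred = [29.0, 37.0, 54.0, 35.0]       # hypothetical ANN outputs

baseline_mae = mean_absolute_error(true_ages, baseline_pred)
ann_mae = mean_absolute_error(true_ages, ann_pred)
improvement = relative_improvement(baseline_mae, ann_mae)

print(f"baseline MAE: {baseline_mae:.2f}")
print(f"ANN MAE: {ann_mae:.2f}")
print(f"relative improvement: {improvement:.1f}%")
```

Under this definition, a lower MAE is better, and the relative improvement expresses the reduction as a fraction of the baseline error rather than as an absolute difference.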


Bibliographic reference. Fedorova, Anna / Glembek, Ondřej / Kinnunen, Tomi / Matějka, Pavel (2015): "Exploring ANN back-ends for i-vector based speaker age estimation", in INTERSPEECH-2015, 3036-3040.