Automatic assessment of speaker nativeness has multiple applications, such as second language learning and interactive voice response (IVR) systems. In this paper we treat it as a regression problem, since the available labels are on a continuous scale. We applied multiple approaches, including phonotactic models, i-vectors, and goodness of pronunciation, covering both segmental and suprasegmental features. The phonotactic models were either trained on the challenge data or built using additional multilingual data from other domains. The outputs of these approaches were then combined in multiple ways and fed to a support vector machine regressor. Results on the test set surpass the provided baseline and are in line with the results obtained on the remaining sets, which suggests that our models generalize well to other datasets.
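The fusion step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each approach yields one scalar score per utterance, uses synthetic random numbers in place of real system outputs and labels, and replaces a full support vector machine regressor with a small linear epsilon-insensitive regressor trained by subgradient descent, so that it runs with the Python standard library alone. All names (`fuse_scores`, `train_linear_svr`, `predict`) and all hyperparameter values are hypothetical.

```python
import random


def fuse_scores(*systems):
    """Early fusion: one row per utterance, one column per system score."""
    return [list(row) for row in zip(*systems)]


def predict(w, b, x):
    """Linear prediction w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svr(X, y, epsilon=0.1, C=1.0, lr=0.01, epochs=500, seed=0):
    """Linear support vector regression (assumed linear kernel).

    Minimizes 0.5*||w||^2 + C * sum(max(0, |w.x + b - y| - epsilon))
    with stochastic subgradient descent.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            err = predict(w, b, X[i]) - y[i]
            # Subgradient of the epsilon-insensitive loss: zero inside the tube.
            g = 0.0 if abs(err) <= epsilon else (1.0 if err > 0 else -1.0)
            for j in range(d):
                # The regularizer is spread evenly over the n samples.
                w[j] -= lr * (w[j] / n + C * g * X[i][j])
            b -= lr * C * g
    return w, b


# Synthetic stand-ins for per-utterance scores (assumption: NOT real data).
rng = random.Random(42)
n = 40
phonotactic = [rng.random() for _ in range(n)]
ivector = [rng.random() for _ in range(n)]
gop = [rng.random() for _ in range(n)]
# Synthetic continuous "nativeness" labels, built only to make the demo learnable.
labels = [0.5 * p + 0.3 * v + 0.2 * g for p, v, g in zip(phonotactic, ivector, gop)]

X = fuse_scores(phonotactic, ivector, gop)
w, b = train_linear_svr(X, labels)

# Error on the training data itself, shown only as a sanity check.
mae = sum(abs(predict(w, b, x) - t) for x, t in zip(X, labels)) / n
print(f"training MAE: {mae:.3f}")
```

In practice a library regressor (for example a kernel SVR from a machine learning toolkit) and a held-out evaluation set would take the place of the hand-written trainer and the training-set check used here.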
Bibliographic reference. Ribeiro, Eugénio / Ferreira, Jaime / Olcoz, Julia / Abad, Alberto / Moniz, Helena / Batista, Fernando / Trancoso, Isabel (2015): "Combining multiple approaches to predict the degree of nativeness", In INTERSPEECH-2015, 488-492.