8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Formal Modelling of L1 and L2 Perceptual Learning: Computational Linguistics versus Machine Learning

Paola Escudero, Jelle Kastelein, Klara Weiand, R. J. J. H. van Son

University of Amsterdam, The Netherlands

In this paper, we evaluate how well two widely used machine learning algorithms and one computational linguistic proposal model L2 perceptual development. The machine learning algorithms are Nearest Neighbor and Naive Bayes; the computational linguistic proposal is Stochastic OT combined with the Gradual Learning Algorithm. We compared the outputs of the three models to those of Spanish learners of Dutch, who were asked to categorize synthetic stimuli as one of the 12 Dutch vowels. The empirical results show that L2 learners differ significantly from native listeners, but also that their perceptual spaces tend to become more native-like as L2 proficiency increases. The simulation results show that all three algorithms can model the listeners' data to a certain extent, but that the linguistic model, i.e. Stochastic OT with the Gradual Learning Algorithm, reproduces the L1 and L2 data best.

Bibliographic reference. Escudero, Paola / Kastelein, Jelle / Weiand, Klara / Son, R. J. J. H. van (2007): "Formal modelling of L1 and L2 perceptual learning: computational linguistics versus machine learning", in INTERSPEECH-2007, 1889-1892.