Cross-Lingual Transfer Learning for Affective Spoken Dialogue Systems

Kristijan Gjoreski, Aleksandar Gjoreski, Ivan Kraljevski, Diane Hirschfeld


This paper presents a case study of cross-lingual transfer learning applied to affective computing in the domain of spoken dialogue systems. Prosodic features of correction dialogue acts are modeled on a group of languages and compared with those of languages excluded from the analysis.

Speech from different languages was recorded in carefully staged Wizard-of-Oz experiments, although a balanced distribution of speakers per language could not be ensured. To assess the feasibility of cross-lingual transfer learning and to ensure reliable, language-independent classification of corrections, we employed different machine learning approaches along with relevant acoustic-prosodic feature sets.

The results of the experiments with mono-lingual corpora (trained and tested on a single language) and cross-lingual corpora (trained on several languages and tested on the rest) were analyzed and compared in terms of accuracy and F1 score.
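
The two evaluation settings described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `split_by_language` and `accuracy_and_f1`, the `(language, features, label)` sample layout, and the choice of label `1` for "correction" are all assumptions made for illustration. The sketch shows only how a language-based hold-out split and the two reported metrics could be computed for a binary correction classifier.

```python
def split_by_language(samples, held_out):
    """Split samples into train/test sets by language.

    samples:  list of (language, features, label) tuples (hypothetical layout).
    held_out: set of language codes reserved for testing.
    Cross-lingual setting: train on all languages not in held_out and
    test on the held-out ones.
    """
    train = [s for s in samples if s[0] not in held_out]
    test = [s for s in samples if s[0] in held_out]
    return train, test


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels.

    positive: label treated as the positive class (assumed here to mean
    "correction"). F1 is defined as 0.0 when there are no true positives,
    false positives, or false negatives.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1
```

Under these assumptions, the mono-lingual setting would split the samples of a single language into training and test portions, whereas the cross-lingual setting would call `split_by_language` with the set of languages to withhold from training.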


DOI: 10.21437/Interspeech.2019-2163

Cite as: Gjoreski, K., Gjoreski, A., Kraljevski, I., Hirschfeld, D. (2019) Cross-Lingual Transfer Learning for Affective Spoken Dialogue Systems. Proc. Interspeech 2019, 1916-1920, DOI: 10.21437/Interspeech.2019-2163.


@inproceedings{Gjoreski2019,
  author={Kristijan Gjoreski and Aleksandar Gjoreski and Ivan Kraljevski and Diane Hirschfeld},
  title={{Cross-Lingual Transfer Learning for Affective Spoken Dialogue Systems}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1916--1920},
  doi={10.21437/Interspeech.2019-2163},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2163}
}