Investigating Adaptation and Transfer Learning for End-to-End Spoken Language Understanding from Speech

Natalia Tomashenko, Antoine Caubrière, Yannick Estève


This work investigates speaker adaptation and transfer learning for spoken language understanding (SLU). We focus on the direct extraction of semantic tags from the audio signal using an end-to-end neural network approach. We demonstrate that the learning performance of the target predictive function for the semantic slot filling task can be substantially improved by speaker adaptation and by various knowledge transfer approaches. First, we explore speaker adaptive training (SAT) for end-to-end SLU models and propose using zero pseudo i-vectors for more efficient model initialization and pretraining in SAT. Second, to improve learning convergence for the target semantic slot filling (SF) task, models trained for different tasks, such as automatic speech recognition and named entity extraction, are used to initialize neural end-to-end models trained for the target task. In addition, we explore the impact of knowledge transfer for SLU from a speech recognition task trained in a different language. These approaches make it possible to develop end-to-end SLU systems in low-resource scenarios where there is not enough in-domain semantically labeled data, but other resources, such as word transcriptions for the same or another language or named entity annotations, are available.
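
The zero pseudo i-vector idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how speaker vectors are fed to the network, so this example assumes the common SAT setup in which the i-vector is concatenated to every acoustic frame. The function name `append_speaker_vector`, the feature values, and the 4-dimensional vector size are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence


def append_speaker_vector(
    frames: Sequence[Sequence[float]],
    ivector: Optional[Sequence[float]] = None,
    ivector_dim: int = 4,  # hypothetical size; real i-vectors are much larger
) -> List[List[float]]:
    """Append a speaker vector to every acoustic frame.

    Assumption: speaker information enters the model by concatenation.
    When no i-vector is given, an all-zero "pseudo i-vector" is used, so the
    input layer keeps the same shape before and after speaker adaptation.
    """
    if ivector is None:
        ivector = [0.0] * ivector_dim
    if len(ivector) != ivector_dim:
        raise ValueError("i-vector has an unexpected dimension")
    return [list(frame) + list(ivector) for frame in frames]


# Two toy acoustic frames with three features each (made-up values).
frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# Pretraining stage: no speaker information, zeros fill the i-vector slots.
pretrain_inputs = append_speaker_vector(frames)

# SAT stage: the same input layout, now with a (made-up) speaker i-vector.
sat_inputs = append_speaker_vector(frames, ivector=[0.7, -0.2, 0.0, 0.5])
```

Under this assumption, a model pretrained on zero-padded inputs can be fine-tuned with real i-vectors without changing its input dimension, which is one plausible reading of why zero pseudo i-vectors help initialization.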


DOI: 10.21437/Interspeech.2019-2158

Cite as: Tomashenko, N., Caubrière, A., Estève, Y. (2019) Investigating Adaptation and Transfer Learning for End-to-End Spoken Language Understanding from Speech. Proc. Interspeech 2019, 824-828, DOI: 10.21437/Interspeech.2019-2158.


@inproceedings{Tomashenko2019,
  author={Natalia Tomashenko and Antoine Caubrière and Yannick Estève},
  title={{Investigating Adaptation and Transfer Learning for End-to-End Spoken Language Understanding from Speech}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={824--828},
  doi={10.21437/Interspeech.2019-2158},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2158}
}