Attention Model for Articulatory Features Detection

Ievgen Karaulov, Dmytro Tkanov


Articulatory distinctive features, as well as phonetic transcription, play an important role in speech-related tasks: computer-assisted pronunciation training, text-to-speech conversion (TTS), studying speech production mechanisms, and speech recognition for low-resourced languages. End-to-end approaches to speech-related tasks have gained a lot of traction in recent years. We apply the Listen, Attend and Spell (LAS) [1] architecture to phone recognition on a small training set, such as TIMIT [2]. We also introduce a novel decoding technique that allows manner and place of articulation detectors to be trained end-to-end using attention models. In addition, we explore joint phone recognition and articulatory feature detection in a multitask learning setting.


DOI: 10.21437/Interspeech.2019-3020

Cite as: Karaulov, I., Tkanov, D. (2019) Attention Model for Articulatory Features Detection. Proc. Interspeech 2019, 1571-1575, DOI: 10.21437/Interspeech.2019-3020.


@inproceedings{Karaulov2019,
  author={Ievgen Karaulov and Dmytro Tkanov},
  title={{Attention Model for Articulatory Features Detection}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1571--1575},
  doi={10.21437/Interspeech.2019-3020},
  url={http://dx.doi.org/10.21437/Interspeech.2019-3020}
}