Investigating Objective Intelligibility in Real-Time EMG-to-Speech Conversion

Lorenz Diener, Tanja Schultz


This paper analyzes how various system parameters influence the output quality of our neural-network-based real-time EMG-to-Speech conversion system. The system converts facial surface electromyographic (EMG) signals directly into audible speech in real time, which enables a closed-loop setup in which users get direct audio feedback. Such a setup opens new avenues for research and applications through co-adaptation approaches. We evaluate the influence of several parameters: time context, EMG-audio delay, network size, training data size, and Mel spectrogram size. Output quality is measured with the objective intelligibility measure STOI (short-time objective intelligibility).


DOI: 10.21437/Interspeech.2018-2080

Cite as: Diener, L., Schultz, T. (2018) Investigating Objective Intelligibility in Real-Time EMG-to-Speech Conversion. Proc. Interspeech 2018, 3162-3166, DOI: 10.21437/Interspeech.2018-2080.


@inproceedings{Diener2018,
  author={Lorenz Diener and Tanja Schultz},
  title={Investigating Objective Intelligibility in Real-Time EMG-to-Speech Conversion},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3162--3166},
  doi={10.21437/Interspeech.2018-2080},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2080}
}