ISCA Archive Interspeech 2009

Speech generation from hand gestures based on space mapping

Aki Kunikoshi, Yu Qiao, Nobuaki Minematsu, Keikichi Hirose

Individuals with speaking disabilities, particularly people suffering from dysarthria, often use a TTS synthesizer for speech communication. Since users must type sound symbols and the synthesizer reads them out in a monotonous style, current synthesizers make real-time operation and lively communication difficult. This is why dysarthric users often fail to control the flow of conversation. In this paper, we propose a novel speech generation framework that takes hand gestures as input. While speech is normally produced through transitions of tongue gestures, we develop a special glove with which speech sounds are generated from transitions of hand gestures. For development, GMM-based voice conversion techniques (mapping techniques) are applied to estimate a mapping function between a space of hand gestures and a space of speech sounds. In this paper, as an initial trial, a mapping between hand gestures and Japanese vowel sounds is estimated so that topological features of the selected gestures in a feature space match those of the five Japanese vowels in a cepstrum space. Experiments show that the glove can generate good Japanese vowel transitions with voluntary control of duration and articulation.
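The GMM-based mapping named in the abstract can be sketched as follows: fit a Gaussian mixture on joint [gesture; cepstrum] vectors, then convert a gesture vector to a cepstral estimate via the conditional mean of each component, weighted by its responsibility. This is a minimal illustrative sketch, not the authors' implementation; the dimensions, synthetic training data, and the `convert` helper are assumptions for demonstration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical dimensions: 3-D hand-gesture features -> 2-D cepstral features
rng = np.random.default_rng(0)
n, dx, dy = 500, 3, 2

# Synthetic paired data standing in for recorded glove/cepstrum pairs
X = rng.normal(size=(n, dx))
W = rng.normal(size=(dx, dy))
Y = X @ W + 0.1 * rng.normal(size=(n, dy))

# Fit a GMM on joint [x; y] vectors, the classic setup for
# GMM-based voice conversion (Stylianou-style regression)
Z = np.hstack([X, Y])
gmm = GaussianMixture(n_components=4, covariance_type="full",
                      random_state=0).fit(Z)

def convert(x):
    """Map a gesture vector x to a cepstral estimate via E[y | x]."""
    M = gmm.n_components
    mu_x = gmm.means_[:, :dx]          # component means over x, shape (M, dx)
    mu_y = gmm.means_[:, dx:]          # component means over y, shape (M, dy)
    S = gmm.covariances_               # joint covariances, shape (M, dx+dy, dx+dy)

    # Responsibilities p(m | x) from the marginal mixture over x
    log_resp = np.empty(M)
    for m in range(M):
        Sxx = S[m, :dx, :dx]
        d = x - mu_x[m]
        _, logdet = np.linalg.slogdet(Sxx)
        log_resp[m] = (np.log(gmm.weights_[m])
                       - 0.5 * (logdet + d @ np.linalg.solve(Sxx, d)
                                + dx * np.log(2 * np.pi)))
    resp = np.exp(log_resp - log_resp.max())
    resp /= resp.sum()

    # Responsibility-weighted sum of per-component conditional means
    y_hat = np.zeros(dy)
    for m in range(M):
        Sxx = S[m, :dx, :dx]
        Syx = S[m, dx:, :dx]
        y_hat += resp[m] * (mu_y[m] + Syx @ np.linalg.solve(Sxx, x - mu_x[m]))
    return y_hat

y_est = convert(X[0])
```

In practice, such a mapping would be trained on paired glove and vowel-cepstrum recordings and run frame by frame, so smooth hand-gesture transitions yield smooth cepstral (and hence vowel) transitions.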

doi: 10.21437/Interspeech.2009-102

Cite as: Kunikoshi, A., Qiao, Y., Minematsu, N., Hirose, K. (2009) Speech generation from hand gestures based on space mapping. Proc. Interspeech 2009, 308-311, doi: 10.21437/Interspeech.2009-102

@inproceedings{kunikoshi09_interspeech,
  author={Aki Kunikoshi and Yu Qiao and Nobuaki Minematsu and Keikichi Hirose},
  title={{Speech generation from hand gestures based on space mapping}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={308--311},
  doi={10.21437/Interspeech.2009-102}
}