11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

The Use of Air-Pressure Sensor in Electrolaryngeal Speech Enhancement Based on Statistical Voice Conversion

Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano

NAIST, Japan

In our previous work, we proposed a speaking-aid system that converts electrolaryngeal speech (EL speech) to normal speech using a statistical voice conversion technique. The main weakness of that system is the difficulty of estimating natural fundamental frequency (F0) contours from EL speech, which includes only built-in F0 contours. This paper proposes another speaking-aid system with an air-pressure sensor that enables laryngectomees to control the F0 contours of EL speech using their breath. The experimental results demonstrate that 1) the correlation coefficient of F0 contours between the converted and the target speech improves from 0.58 to 0.78 with the use of the air-pressure sensor, and 2) the speech converted by the proposed system sounds more natural and is preferred over that of our conventional speaking-aid system.

Bibliographic reference. Nakamura, Keigo / Toda, Tomoki / Saruwatari, Hiroshi / Shikano, Kiyohiro (2010): "The use of air-pressure sensor in electrolaryngeal speech enhancement based on statistical voice conversion", In INTERSPEECH-2010, 1628-1631.