We investigate methods for improving the intelligibility of synthetic speech under noisy or low-fidelity acoustic conditions. The techniques explored modify speech in a natural manner, so that users won't need training to understand the enhanced speech. While the improvements are natural in this respect, the changes aren't limited to speech that a human vocal tract could produce. Modifications fall into three broad classes: increasing phoneme amplitude, altering spectral shape, and lengthening phoneme duration. Listening tests conducted in noisy and noise-free conditions demonstrate significant improvements in intelligibility for most of the subject phonemes.
Cite as: Pan, D., Heng, B., Cheung, S., Chang, E. (2000) Improving speech synthesis for high intelligibility under adverse conditions. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 721-724
@inproceedings{pan00_icslp,
  author={Davis Pan and Brian Heng and Shiufun Cheung and Ed Chang},
  title={{Improving speech synthesis for high intelligibility under adverse conditions}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 1, 721-724}
}