This paper introduces a novel approach to real-time speech modulation enhancement that increases speech intelligibility in noise. The proposed technique operates independently in the frequency and time domains. In the frequency domain, a compression function reallocates energy within a frame; this function includes novel scaling operations that preserve speech quality. In the time domain, a mathematical equation is introduced to reallocate energy from the louder to the quieter parts of the speech. This equation ensures that the long-term energy of the speech is preserved regardless of the amount of compression, giving full control over the time-domain energy reallocation in real time. Evaluations of intelligibility and quality show that the proposed approach increases the intelligibility of speech while maintaining the overall energy and quality of the speech signal.
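To make the time-domain idea concrete, the sketch below shifts energy from louder to quieter frames and then restores the total energy. It is not the equation from the paper, which the abstract does not state. It assumes a simple power-law gain on per-frame RMS, and it uses an offline global rescale in place of the real-time energy constraint. The function name `reallocate_energy` and the parameters `frame_len` and `alpha` are hypothetical.

```python
import numpy as np

def reallocate_energy(x, frame_len=160, alpha=0.5, eps=1e-12):
    """Illustrative only: power-law compression of frame energies,
    followed by a global rescale so the total energy is unchanged.
    alpha < 1 boosts quiet frames relative to loud ones; alpha = 1
    leaves the signal untouched. Trailing samples that do not fill
    a whole frame are dropped."""
    x = np.asarray(x, dtype=float)
    n_frames = len(x) // frame_len
    trimmed = x[: n_frames * frame_len]
    frames = trimmed.reshape(n_frames, frame_len).copy()

    # Per-frame RMS level (assumed stand-in for the temporal envelope).
    rms = np.sqrt(np.mean(frames ** 2, axis=1) + eps)

    # Compressive gain: frame RMS is mapped from rms to rms**alpha.
    gain = rms ** (alpha - 1.0)
    frames *= gain[:, None]

    # Offline energy preservation (assumption: the paper achieves this
    # in real time with its own equation, which is not reproduced here).
    e_in = np.sum(trimmed ** 2)
    e_out = np.sum(frames ** 2)
    frames *= np.sqrt(e_in / (e_out + eps))
    return frames.reshape(-1)
```

Applied to a signal with a loud segment followed by a quiet one, this sketch keeps the summed energy constant while raising the quiet segment's share of it, whatever value of `alpha` is chosen.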
Cite as: Koutsogiannaki, M., Francois, H., Choo, K., Oh, E. (2017) Real-Time Modulation Enhancement of Temporal Envelopes for Increasing Speech Intelligibility. Proc. Interspeech 2017, 1973-1977, doi: 10.21437/Interspeech.2017-1157
@inproceedings{koutsogiannaki17_interspeech,
  author={Maria Koutsogiannaki and Holly Francois and Kihyun Choo and Eunmi Oh},
  title={{Real-Time Modulation Enhancement of Temporal Envelopes for Increasing Speech Intelligibility}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1973--1977},
  doi={10.21437/Interspeech.2017-1157}
}