Most current techniques for near-end speech intelligibility enhancement focus on processing clean input signals; in realistic environments, however, the input is often noisy. Processing noisy speech for intelligibility enhancement with algorithms developed for clean signals can lower the perceptual quality of the samples when they are listened to in quiet. Here we address the quality loss in these conditions by combining noise reduction with a multi-band version of a state-of-the-art intelligibility enhancer for clean speech that is based on spectral shaping and dynamic range compression (SSDRC). Subjective quality and intelligibility assessments with noisy input speech showed that: (a) in quiet near-end conditions, the proposed system outperformed the baseline SSDRC in terms of Mean Opinion Score (MOS); (b) in speech-shaped near-end noise, the proposed system improved the intelligibility of unprocessed speech by a factor larger than three at the lowest tested signal-to-noise ratio (SNR); overall, however, it yielded lower recognition scores than the standard SSDRC.
Cite as: Zorilă, T.-C., Stylianou, Y. (2017) On the Quality and Intelligibility of Noisy Speech Processed for Near-End Listening Enhancement. Proc. Interspeech 2017, 2023-2027, doi: 10.21437/Interspeech.2017-1225
@inproceedings{zorila17_interspeech,
  author={Tudor-Cătălin Zorilă and Yannis Stylianou},
  title={{On the Quality and Intelligibility of Noisy Speech Processed for Near-End Listening Enhancement}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2023--2027},
  doi={10.21437/Interspeech.2017-1225}
}