Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Auditory Masking Threshold Estimation for Broadband Noise Sources with Application to Speech Enhancement

Ruhi Sarikaya, John H. L. Hansen

Robust Speech Processing Laboratory, Center for Spoken Language Understanding, University of Colorado at Boulder, Boulder, CO, USA

This paper addresses issues encountered in the use of an Auditory Masking Threshold (AMT) for speech enhancement and proposes an algorithm to improve AMT estimation for broadband noise sources. We determined that AMT estimation is fairly accurate for low-frequency colored noise sources, so an AMT-based enhancement scheme can suppress audible noise to a greater extent for them, but that the algorithm fails to converge to the clean speech AMT for broadband communication channel noise. We propose a new AMT estimation scheme and incorporate the proposed algorithm into a previously developed enhancement framework [2]. We evaluate our algorithm on a set of sentences obtained from the standard TIMIT database for flat communications channel noise (FLN) and automobile highway noise (HWY) at 5 dB and 0 dB SNR levels, respectively. Evaluations were performed for 8 kHz and 16 kHz sampled speech, and performance is measured with both objective and subjective assessment methods. The results show that the new AMT codebook-based enhancement method is more effective than traditional AMT methods. They also show that traditional AMT methods may not be as effective for reduced-bandwidth speech (4 kHz) or broadband interference, but that alternative AMT estimation methods can help improve convergence properties.


Bibliographic reference.  Sarikaya, Ruhi / Hansen, John H. L. (1999): "Auditory masking threshold estimation for broadband noise sources with application to speech enhancement", In EUROSPEECH'99, 2571-2574.