ISCA Archive Interspeech 2013

A two-step technique for MRI audio enhancement using dictionary learning and wavelet packet analysis

Colin Vaz, Vikram Ramanarayanan, Shrikanth Narayanan

We present a method for speech enhancement of data collected in extremely noisy environments, such as those found during magnetic resonance imaging (MRI) scans. We propose a two-step algorithm to perform this noise suppression. First, we use probabilistic latent component analysis (PLCA) to learn dictionaries of the noise and speech-plus-noise portions of the data, and use these to factor the noisy spectrum into estimated speech and noise components. Second, we apply a wavelet packet analysis in conjunction with a wavelet threshold that minimizes the Kullback-Leibler (KL) divergence between the estimated speech and noise to achieve further noise suppression. Based on both objective and subjective assessments, we find that our algorithm significantly outperforms traditional techniques such as normalized least mean squares (NLMS). Unlike current state-of-the-art algorithms, it requires neither prior knowledge of the noise waveforms nor that they be periodic.
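
The first step can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so the code below assumes a generic two-dimensional PLCA fitted with EM updates on a non-negative magnitude spectrogram (equivalent to KL-divergence NMF with normalized basis columns). The function names `plca` and `separate`, the dictionary sizes, and the soft-mask reconstruction are all illustrative assumptions, not the authors' implementation. The wavelet packet thresholding step is not shown.

```python
import numpy as np

def plca(V, n_new, W_fixed=None, n_iter=100, seed=0):
    """Fit V (freq x frames, non-negative) as W @ H with generic 2-D PLCA EM updates.

    Columns of W are spectral basis vectors that sum to 1. If W_fixed is given,
    those columns are held constant and only n_new extra columns are learned.
    """
    eps = 1e-12
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W_new = rng.random((n_freq, n_new)) + eps
    W_new /= W_new.sum(axis=0, keepdims=True)
    n_fixed = 0 if W_fixed is None else W_fixed.shape[1]
    W = W_new if W_fixed is None else np.hstack([W_fixed, W_new])
    H = rng.random((W.shape[1], n_frames)) + eps
    for _ in range(n_iter):
        R = V / (W @ H + eps)            # ratio of data to current model
        W_upd = W * (R @ H.T)            # unnormalized basis update
        H = H * (W.T @ R)                # activation update (same posterior)
        W_upd /= W_upd.sum(axis=0, keepdims=True) + eps
        W[:, n_fixed:] = W_upd[:, n_fixed:]   # fixed columns stay untouched
    return W, H

def separate(V_noisy, W_noise, n_speech=20, n_iter=100):
    """Split a noisy spectrogram into speech and noise estimates.

    The noise dictionary is held fixed while additional (assumed speech)
    bases are learned from the noisy data; a soft mask then divides V_noisy.
    """
    W, H = plca(V_noisy, n_speech, W_fixed=W_noise, n_iter=n_iter)
    k = W_noise.shape[1]
    noise_model = W[:, :k] @ H[:k]
    speech_model = W[:, k:] @ H[k:]
    mask = speech_model / (speech_model + noise_model + 1e-12)
    return mask * V_noisy, (1.0 - mask) * V_noisy
```

In this sketch, one would call `plca` on a noise-only segment of the recording to obtain `W_noise`, then pass it to `separate` together with the spectrogram of the speech-plus-noise segment. The two returned estimates sum to the input spectrogram by construction.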


doi: 10.21437/Interspeech.2013-349

Cite as: Vaz, C., Ramanarayanan, V., Narayanan, S. (2013) A two-step technique for MRI audio enhancement using dictionary learning and wavelet packet analysis. Proc. Interspeech 2013, 1312-1315, doi: 10.21437/Interspeech.2013-349

@inproceedings{vaz13_interspeech,
  author={Colin Vaz and Vikram Ramanarayanan and Shrikanth Narayanan},
  title={{A two-step technique for MRI audio enhancement using dictionary learning and wavelet packet analysis}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={1312--1315},
  doi={10.21437/Interspeech.2013-349}
}