ISCA Archive Odyssey 2012

Linear prediction modulation filtering for speaker recognition of reverberant speech

Bengt Jonas Borgström, Alan McCree

This paper proposes a framework for spectral enhancement of reverberant speech based on inversion of the modulation transfer function. All-pole modeling of the modulation spectra of clean and degraded speech is used to derive the linear prediction inverse modulation transfer function (LP-IMTF) solution as a low-order IIR filter in the modulation envelope domain. By considering spectral estimation under speech presence uncertainty, speech presence probabilities are derived for the case of reverberation. Aside from enhancement, the LP-IMTF framework allows for blind estimation of reverberation time by extracting a minimum-phase approximation of the short-time spectral channel impulse response. The proposed speech enhancement method is used as a front-end processing step for speaker recognition. When applied to the microphone condition of the NIST SRE 2010 evaluation with artificially added reverberation, the proposed spectral enhancement method yields significant improvements across a variety of performance metrics.
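
To illustrate the idea of a low-order IIR inverse filter built from two all-pole models, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not give the model order, the gain handling, or the exact filter form, so the following are assumptions made for illustration: the inverse filter is taken as the ratio A_r(z)/A_c(z) of the two prediction polynomials scaled by the ratio of model gains, the order is 2, the envelope trajectories are synthetic and zero-mean, and the helper names `lpc`, `iir_filter`, and `lp_inverse_mtf` are hypothetical.

```python
import numpy as np

def lpc(x, order):
    """Autocorrelation-method linear prediction.

    Returns a = [1, a1, ..., ap] for A(z) = 1 + sum_k a_k z^-k
    and the model gain (square root of the prediction error energy).
    """
    x = np.asarray(x, dtype=float)
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.concatenate(([1.0], np.linalg.solve(R, -r[1:])))
    gain = np.sqrt(np.dot(r, a))
    return a, gain

def iir_filter(b, a, x):
    """Direct-form difference equation y = (B(z) / A(z)) x."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y[n] = acc / a[0]
    return y

def lp_inverse_mtf(clean_env, reverb_env, order=2):
    """Assumed form of the inverse filter: (g_c / g_r) * A_r(z) / A_c(z)."""
    a_c, g_c = lpc(clean_env, order)
    a_r, g_r = lpc(reverb_env, order)
    return (g_c / g_r) * a_r, a_c

# Synthetic demo (assumption: reverberation smears the envelope
# like a one-pole exponential decay along the time axis).
rng = np.random.default_rng(0)
w = rng.standard_normal(2000)
clean = iir_filter([1.0], [1.0, -0.5], w)        # "clean" envelope trajectory
reverb = iir_filter([1.0], [1.0, -0.8], clean)   # smeared trajectory

b, a = lp_inverse_mtf(clean, reverb, order=2)
enhanced = iir_filter(b, a, reverb)

err_reverb = np.mean((reverb - clean) ** 2)
err_enhanced = np.mean((enhanced - clean) ** 2)
```

In this toy setup the clean model is estimated from the clean trajectory itself. In practice the clean-speech model would have to come from prior data, since the clean signal is not available at test time.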


Cite as: Borgström, B.J., McCree, A. (2012) Linear prediction modulation filtering for speaker recognition of reverberant speech. Proc. The Speaker and Language Recognition Workshop (Odyssey 2012), 187-193

@inproceedings{borgstrom12_odyssey,
  author={Bengt Jonas Borgström and Alan McCree},
  title={{Linear prediction modulation filtering for speaker recognition of reverberant speech}},
  year=2012,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2012)},
  pages={187--193}
}