ISCA Archive Interspeech 2005

A framework for estimation of clean speech by fusion of outputs from multiple speech enhancement systems

Venkatesh Krishnan, Phil S. Whitehead, David V. Anderson, Mark A. Clements

A novel multiple-input Kalman filtering (MIKF) framework is presented that estimates the clean speech signal by fusion of outputs from multiple speech enhancement systems. The MIKF framework generates a sample-by-sample minimum mean-square error estimate of the clean speech signal from these outputs. The residual noise in each input to the MIKF is modeled as an autoregressive (AR) process so that non-white noise can be accommodated, and the noise model is dynamically updated to handle non-stationary noise. Speech is also modeled as an AR process whose parameters are estimated from a codebook of suitably designed prototype AR parameters. Constraining the AR parameters via a codebook improves the quality and makes it easy to integrate the MIKF system with a speech coder. The proposed framework also has the flexibility to apply user-defined, heuristic weights to the inputs to the MIKF, which are the outputs of the contributing speech enhancement systems. Perceptual quality tests and objective measures (segmental signal-to-noise ratio) both demonstrate that the estimate of the clean speech signal generated by the MIKF is superior to any of its inputs.
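The fusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a fixed, known AR speech model (no codebook search), and it treats the residual noise in each input as white with known variance, whereas the paper models that noise as an AR process and updates it dynamically. The function name `fuse_kalman` and all of its parameters are hypothetical.

```python
import numpy as np

def fuse_kalman(observations, ar_coeffs, process_var, noise_vars):
    """Fuse K noisy versions of one signal with a Kalman filter.

    observations: array of shape (K, T), one row per enhancement-system output.
    ar_coeffs:    a_1..a_p of the assumed speech model
                  s[t] = sum_i a_i * s[t-i] + w[t].
    process_var:  variance of the AR excitation w[t].
    noise_vars:   length-K residual-noise variances (white-noise simplification).
    """
    obs = np.asarray(observations, dtype=float)
    K, T = obs.shape
    a = np.asarray(ar_coeffs, dtype=float)
    p = a.size

    # Companion-form transition; state = [s[t], s[t-1], ..., s[t-p+1]].
    F = np.zeros((p, p))
    F[0, :] = a
    F[1:, :-1] = np.eye(p - 1)
    Q = np.zeros((p, p))
    Q[0, 0] = process_var

    # Every input observes the current speech sample plus its own noise.
    H = np.zeros((K, p))
    H[:, 0] = 1.0
    R = np.diag(np.asarray(noise_vars, dtype=float))

    x = np.zeros(p)   # state estimate
    P = np.eye(p)     # state error covariance
    est = np.empty(T)
    for t in range(T):
        # Predict from the AR model.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with all K inputs at once.
        S = H @ P @ H.T + R
        G = np.linalg.solve(S, H @ P).T   # Kalman gain P H^T S^{-1}
        x = x + G @ (obs[:, t] - H @ x)
        P = (np.eye(p) - G @ H) @ P
        est[t] = x[0]
    return est
```

In this simplified setting, an input with a larger entry in `noise_vars` contributes less to the estimate. Inflating or shrinking those entries is one possible way to impose heuristic per-input weights, though the paper's actual weighting mechanism may differ.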


doi: 10.21437/Interspeech.2005-740

Cite as: Krishnan, V., Whitehead, P.S., Anderson, D.V., Clements, M.A. (2005) A framework for estimation of clean speech by fusion of outputs from multiple speech enhancement systems. Proc. Interspeech 2005, 2317-2320, doi: 10.21437/Interspeech.2005-740

@inproceedings{krishnan05_interspeech,
  author={Venkatesh Krishnan and Phil S. Whitehead and David V. Anderson and Mark A. Clements},
  title={{A framework for estimation of clean speech by fusion of outputs from multiple speech enhancement systems}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={2317--2320},
  doi={10.21437/Interspeech.2005-740}
}