ISCA Archive ICSLP 1998

A practical perceptual frequency autoregressive HMM enhancement system

Beth Logan, Tony Robinson

We have previously developed an adaptive speech enhancement scheme. This models speech and noise using perceptual frequency or 'warped' autoregressive HMMs (AR-HMMs) and estimates the clean speech and noise parameters within this framework. In this paper, we investigate the use of our system as a front-end to a clean MFCC recognition system. We make two main modifications to our scheme. First, we use MMSE spectral rather than time-domain estimators for enhancement. Second, for computational reasons, we form estimators using non-warped AR-HMMs. To avoid mismatch when converting between warped and non-warped models, we use parallel models. Results are presented for small- and medium-vocabulary tasks. On the simple task, we approach the performance of a matched system when language model information is included. On the second task, we are unable to incorporate a language model due to modelling deficiencies in AR-HMMs. However, we still demonstrate substantial improvements over baseline results.


doi: 10.21437/ICSLP.1998-349

Cite as: Logan, B., Robinson, T. (1998) A practical perceptual frequency autoregressive HMM enhancement system. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 1083, doi: 10.21437/ICSLP.1998-349

@inproceedings{logan98_icslp,
  author={Beth Logan and Tony Robinson},
  title={{A practical perceptual frequency autoregressive HMM enhancement system}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 1083},
  doi={10.21437/ICSLP.1998-349}
}