ISCA Archive Interspeech 2013

An investigation of temporally varying weight regression for noise robust speech recognition

Shilin Liu, Khe Chai Sim

In this paper, the recently proposed Temporally Varying Weight Regression (TVWR) is investigated in two ways for noise robust speech recognition. Firstly, because typical model compensation approaches assume that noise features are independent and identically distributed (i.i.d.), non-stationary noise environments can be poorly compensated by conventional model compensation approaches in the standard Hidden Markov Model (HMM) framework. TVWR, however, retains the basic HMM structure while adding a time-varying property; model compensation for TVWR is therefore proposed so that the i.i.d. noise assumption can be relaxed. Secondly, although Noise Adaptive Training (NAT) has been proposed to optimize the "pseudo-clean" HMM for better performance by maximizing the likelihood of multi-condition data, NAT depends heavily on the simplicity of the Vector Taylor Series (VTS) formulation. Hence, other advanced compensation approaches, such as Trajectory-based Parallel Model Combination (TPMC), have difficulty benefiting from this powerful training scheme. This paper exploits the time-varying attribute of TVWR to approximate NAT so that any compensation technique can be applied during noise adaptive training. Experiments on the Aurora 4 corpus show that significant improvements over the standard HMM or NAT system can be obtained by compensating TVWR, whether it is trained on clean data or adaptively trained on multi-condition data.


doi: 10.21437/Interspeech.2013-270

Cite as: Liu, S., Sim, K.C. (2013) An investigation of temporally varying weight regression for noise robust speech recognition. Proc. Interspeech 2013, 2963-2967, doi: 10.21437/Interspeech.2013-270

@inproceedings{liu13b_interspeech,
  author={Shilin Liu and Khe Chai Sim},
  title={{An investigation of temporally varying weight regression for noise robust speech recognition}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={2963--2967},
  doi={10.21437/Interspeech.2013-270}
}