7th International Conference on Spoken Language Processing
September 16-20, 2002
This paper presents a maximum likelihood (ML) approach to background model estimation in noisy, non-stationary acoustic environments. The external noise source is characterised by a time-invariant convolutional component and a time-varying additive component, which is consistent with the telephone channel. The HMM composition technique provides a mechanism for integrating parametric models of the acoustic background with the signal model, so that noise compensation is tightly coupled with background model estimation. However, existing continuous adaptation algorithms usually do not take advantage of this approach, as they are essentially based on the maximum likelihood linear regression (MLLR) algorithm. Consequently, a model for the environmental mismatch is not available and, even under constrained conditions, a significant number of model parameters have to be updated. From a theoretical point of view, only the noise model parameters need to be updated, since the clean speech parameters are unaffected by the environment. It can therefore be advantageous to have an explicit model for the environmental mismatch, and this is the approach followed in the development of the algorithm proposed in this paper. One drawback sometimes attributed to continuous adaptation is that recognition errors lead to poor background estimates. This paper also proposes a MAP-like (maximum a posteriori) method to deal with this situation.
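As an illustration of the kind of environment model the abstract describes, the sketch below combines a clean-speech log-power spectrum with a convolutional (channel) term and an additive (noise) term using the log-add approximation common in model-composition work. This is an assumption made for illustration only: the abstract does not state the paper's exact mismatch function, and the function name `noisy_log_power` and its arguments are hypothetical.

```python
import math


def noisy_log_power(clean_log, channel_log, noise_log):
    """Hypothetical log-add mismatch function, applied per frequency bin.

    Assumes (not stated in the abstract) that all inputs are natural-log
    power spectra. The channel is convolutional, so it adds to the speech
    in the log domain; the noise adds in the linear power domain.
    Returns log(exp(x + h) + exp(n)) for each bin.
    """
    out = []
    for x, h, n in zip(clean_log, channel_log, noise_log):
        filtered = x + h            # channel-filtered speech, log domain
        m = max(filtered, n)        # factor out the maximum for stability
        out.append(m + math.log(math.exp(filtered - m) + math.exp(n - m)))
    return out


# Example call: one bin dominated by speech, one dominated by noise.
combined = noisy_log_power([5.0, -3.0], [-1.0, -1.0], [-2.0, 2.0])
```

Under this formulation only the channel and noise terms depend on the environment, which is in line with the abstract's point that the clean speech parameters need not be re-estimated.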
Bibliographic reference. Lima, Carlos / Almeida, Luís B. / Monteiro, João L. (2002): "Continuous environmental adaptation of a speech recogniser in telephone line conditions", In ICSLP-2002, 1401-1404.