EUROSPEECH 2003 - INTERSPEECH 2003
In Missing Feature Theory (MFT), it is assumed that some of the features extracted from an observation are missing or unreliable. When MFT is applied to spectral features for noisy speech recognition, the clean feature values are known to be less than the observed noisy features. Based on this inequality constraint, an HMM-state-dependent clean speech value of the missing features can be inferred through maximum likelihood estimation. This paper describes two observed biases of the likelihood evaluated at this estimate. Theoretical and experimental evidence is provided that an upper bound on the accuracy is improved by applying computationally simple corrections for the number of free variables in the likelihood maximization and for the global acoustic space density function.
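The constrained maximum likelihood step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each HMM state is modelled by a single diagonal-covariance Gaussian and that a reliability mask is already available; the function names are hypothetical and are not taken from the paper. Under these assumptions, the likelihood-maximizing clean value of a missing feature, subject to not exceeding the observed noisy value, is the smaller of the state mean and the observation. The two corrections proposed in the paper are not reproduced here; the sketch only returns the uncorrected log-likelihood and the number of features that were maximized over.

```python
import math


def bounded_ml_estimate(mean, observed):
    """Clean-value estimate for a missing feature under a Gaussian state model.

    Assumption: the clean value cannot exceed the observed noisy value, so the
    Gaussian density is maximized at the mean when the mean lies below the
    observation, and at the observation otherwise.
    """
    return min(mean, observed)


def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def state_log_likelihood(observed, reliable, means, variances):
    """Uncorrected log-likelihood of one frame for one state.

    Reliable features are scored at their observed value; missing features are
    scored at their bounded estimate. Returns the log-likelihood and the number
    of missing features (the variables that were free in the maximization).
    """
    total = 0.0
    n_missing = 0
    for y, ok, mu, var in zip(observed, reliable, means, variances):
        if ok:
            total += log_gauss(y, mu, var)
        else:
            n_missing += 1
            total += log_gauss(bounded_ml_estimate(mu, y), mu, var)
    return total, n_missing


# Hypothetical two-channel frame: the second channel is marked unreliable.
loglik, n_missing = state_log_likelihood(
    observed=[1.0, 5.0],
    reliable=[True, False],
    means=[1.0, 3.0],
    variances=[1.0, 1.0],
)
```

Because each missing feature is scored at the point that maximizes its density, the resulting likelihood tends to favour states with more missing features, which is the kind of bias that a correction for the number of free variables is meant to address.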
Bibliographic reference. Hamme, Hugo van (2003): "Two correction models for likelihoods in robust speech recognition using missing feature theory", In EUROSPEECH-2003, 3073-3076.