Sixth International Conference on Spoken Language Processing
Robust speech recognition via hidden Markov modeling of spectral vectors is studied in this paper. The hidden Markov model (HMM) mixture components are assumed to be complex Gaussian with zero mean and diagonal covariance, and they incorporate an unknown scalar gain term. A gain term is associated with each spectral vector and models the varying energy of speech signals. It is estimated by applying the maximum likelihood (ML) criterion. On an isolated digit database, in clean conditions, the spectral modeling approach with ML gain estimation achieved performance similar to that of cepstral modeling of speech.
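As an illustration of ML gain estimation, the sketch below assumes that the gain scales the variance of each spectral bin, so that bin k of a spectral vector is zero-mean complex Gaussian with variance g * variances[k]. Under that assumption the ML gain has a closed form, the mean of |y_k|^2 / variances[k] over the bins. The parameterization and the function names (ml_gain, log_likelihood, score_component) are hypothetical and may differ from the formulation used in the paper.

```python
import math


def ml_gain(spectrum, variances):
    """Closed-form ML gain, assuming y_k ~ CN(0, gain * variances[k])."""
    ratios = [abs(y) ** 2 / v for y, v in zip(spectrum, variances)]
    return sum(ratios) / len(ratios)


def log_likelihood(spectrum, variances, gain):
    """Log-likelihood of a spectral vector under the gain-scaled model."""
    return -sum(
        math.log(math.pi * gain * v) + abs(y) ** 2 / (gain * v)
        for y, v in zip(spectrum, variances)
    )


def score_component(spectrum, variances):
    """Estimate the gain for one mixture component, then score the vector."""
    gain = ml_gain(spectrum, variances)
    return gain, log_likelihood(spectrum, variances, gain)
```

In a recognizer built this way, score_component would be evaluated per spectral vector and per mixture component, so each vector receives its own gain estimate.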
Two additive noise compensation approaches for the spectral modeling scheme are also considered. The first approach requires a full noise HMM. This HMM is combined with the clean speech HMM to yield a noisy speech HMM. The second approach requires only the spectral shape of the noise. A term dependent on the spectral shape, together with an unknown magnitude term, is incorporated into the clean speech HMM to yield a noisy speech HMM. The unknown magnitude of the noise is estimated via the ML criterion. The performance of these two approaches for isolated digit recognition in noise is demonstrated and compared to a robust cepstral modeling approach from the literature.
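The second approach can be sketched under the assumption that speech and noise are independent, zero-mean, and additive in the spectral domain, so that their per-bin variances add. The noisy-speech variance in bin k is then gain * speech_var[k] + noise_mag * noise_shape[k], where noise_shape is the known spectral shape and noise_mag is the unknown magnitude. The grid search over candidate magnitudes is a stand-in chosen for simplicity, not the estimator from the paper, and all names here are hypothetical.

```python
import math


def noisy_variances(speech_var, noise_shape, gain, noise_mag):
    """Per-bin variance of noisy speech, assuming independent additive noise."""
    return [gain * s + noise_mag * n for s, n in zip(speech_var, noise_shape)]


def noisy_log_likelihood(spectrum, variances):
    """Log-likelihood of a spectral vector with y_k ~ CN(0, variances[k])."""
    return -sum(
        math.log(math.pi * v) + abs(y) ** 2 / v
        for y, v in zip(spectrum, variances)
    )


def ml_noise_magnitude(spectrum, speech_var, noise_shape, gain, candidates):
    """Pick the candidate noise magnitude that maximizes the likelihood."""
    return max(
        candidates,
        key=lambda h: noisy_log_likelihood(
            spectrum, noisy_variances(speech_var, noise_shape, gain, h)
        ),
    )
```

The same noisy_variances helper also illustrates the first approach at the level of a single pair of states: with a full noise HMM, the noise variances would come from a noise state rather than from a fixed shape scaled by an estimated magnitude.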
Bibliographic reference. Roberts, William J.J. / Furui, Sadaoki (2000): "Robust speech recognition via modeling spectral coefficients with HMM's with complex Gaussian components", In ICSLP-2000, vol.4, 484-487.