Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Robust Speech Recognition Via Modeling Spectral Coefficients with HMM's with Complex Gaussian Components

William J.J. Roberts (1), Sadaoki Furui (2)

(1) Defence Science Technology Organisation, Information Technology Division, Salisbury, Australia
(2) Department of Computer Science, Tokyo Institute of Technology, Meguro-ku, Tokyo, Japan

Robust speech recognition via hidden Markov modeling of spectral vectors is studied in this paper. The hidden Markov model (HMM) mixture components are assumed to be complex Gaussian with zero mean and diagonal covariance, and to incorporate an unknown scalar gain term. The gain term is associated with each spectral vector and models the varying energy of speech signals. It is estimated by applying the maximum likelihood (ML) criterion. On an isolated digit database, in clean conditions, the spectral modeling approach with ML gain estimation achieved performance similar to that of cepstral modeling of speech.
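The ML gain estimate for a single mixture component can be illustrated with a minimal sketch. The abstract does not say how the gain enters the model, so the sketch assumes it scales the diagonal variances, i.e. each spectral coefficient is distributed as a zero-mean complex Gaussian with variance `gain * variances[k]`. Under that assumption the log-likelihood has a closed-form maximizer. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math

def ml_gain(power, variances):
    """Closed-form ML gain, ASSUMING y_k ~ CN(0, gain * variances[k]).

    power[k] is |y_k|^2, the squared magnitude of spectral coefficient k.
    Setting the derivative of the log-likelihood with respect to the gain
    to zero gives the mean of the variance-normalised powers.
    """
    return sum(p / v for p, v in zip(power, variances)) / len(power)

def log_likelihood(power, variances, gain):
    """Log-likelihood of one spectral vector under the assumed model."""
    return -sum(
        math.log(math.pi * gain * v) + p / (gain * v)
        for p, v in zip(power, variances)
    )

# Toy values (hypothetical): three spectral bins.
power = [4.0, 1.0, 9.0]       # |y_k|^2
variances = [2.0, 1.0, 3.0]   # diagonal covariance of one mixture component
g_hat = ml_gain(power, variances)  # (4/2 + 1/1 + 9/3) / 3 = 2.0
```

In a recognizer the gain would be re-estimated for every spectral vector and every component, and the resulting maximized likelihood would replace the ordinary component likelihood. How the paper handles this inside HMM training and decoding is not stated in the abstract.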

Two additive noise compensation approaches for the spectral modeling scheme are also considered. The first approach requires a full noise HMM. This HMM is combined with the clean speech HMM to yield a noisy speech HMM. The second approach requires only the spectral shape of the noise. A term dependent on the spectral shape, together with an unknown magnitude term, is incorporated into the clean speech HMM to yield a noisy speech HMM. The unknown magnitude of the noise is estimated via the ML criterion. The performance of these two approaches for isolated digit recognition in noise is demonstrated and compared to a robust cepstral modeling approach from the literature.
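Both compensation ideas can be sketched for a single pair of components, under assumptions the abstract does not confirm. The sketch assumes speech and noise are independent, zero-mean complex Gaussians, so that their per-bin variances add. For the first approach it adds the variances of one clean-speech component and one noise component. For the second it models the noisy variance as `clean_var[k] + magnitude * noise_shape[k]` and picks the magnitude by a simple grid search over the log-likelihood. The grid search is a stand-in chosen for clarity, the speech gain term is left out, and all names and numbers are hypothetical.

```python
import math

def combine_variances(clean_var, noise_var):
    """Approach 1 (assumed form): independent additive noise, variances add."""
    return [c + n for c, n in zip(clean_var, noise_var)]

def noisy_log_likelihood(power, clean_var, noise_shape, magnitude):
    """Log-likelihood with noisy variance clean_var[k] + magnitude * noise_shape[k]."""
    total = 0.0
    for p, c, s in zip(power, clean_var, noise_shape):
        v = c + magnitude * s
        total -= math.log(math.pi * v) + p / v
    return total

def ml_noise_magnitude(power, clean_var, noise_shape, candidates):
    """Approach 2 (assumed form): pick the candidate magnitude with the
    highest likelihood. A grid search is used here only for simplicity."""
    return max(
        candidates,
        key=lambda m: noisy_log_likelihood(power, clean_var, noise_shape, m),
    )

# Toy values (hypothetical): three spectral bins.
clean_var = [2.0, 1.0, 3.0]
noise_shape = [1.0, 0.5, 2.0]          # known spectral shape, unknown scale
power = [3.5, 1.75, 6.0]               # |y_k|^2 of one noisy spectral vector
candidates = [0.25 * i for i in range(21)]   # 0.0, 0.25, ..., 5.0
m_hat = ml_noise_magnitude(power, clean_var, noise_shape, candidates)
```

The toy powers equal `clean_var[k] + 1.5 * noise_shape[k]` in every bin, so the search returns a magnitude of 1.5. With a full noise HMM, the first approach would apply `combine_variances` to every pairing of speech and noise components, which is why the second approach needs less prior knowledge about the noise.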

Bibliographic reference.  Roberts, William J.J. / Furui, Sadaoki (2000): "Robust speech recognition via modeling spectral coefficients with HMM's with complex Gaussian components", In ICSLP-2000, vol.4, 484-487.