This paper studies robust speech recognition via hidden Markov modeling of spectral vectors. The hidden Markov model (HMM) mixture components are assumed to be complex Gaussian with zero mean and diagonal covariance, and to incorporate an unknown scalar gain term. The gain term is associated with each spectral vector and models the varying energy of speech signals. It is estimated by applying the maximum likelihood (ML) criterion. On an isolated digit database, in clean conditions, spectral modeling with ML gain estimation achieved performance similar to that of cepstral modeling of speech.
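As an illustration of the ML gain estimation described above, the following minimal sketch considers a single mixture component. It assumes that each spectral coefficient y[k] is an independent zero-mean complex Gaussian whose variance is g * var[k], so that the gain g scales the variances. The abstract does not state the exact parameterization, so this form, and the names `ml_gain` and `log_likelihood`, are assumptions made for illustration. Under this assumption, setting the derivative of the log-likelihood with respect to g to zero gives the closed form g = (1/K) * sum_k |y[k]|^2 / var[k].

```python
import math

def log_likelihood(y, var, g):
    """Log-likelihood of complex coefficients y[k] ~ CN(0, g * var[k]).

    Assumed model (not taken from the paper): independent bins,
    zero mean, diagonal covariance scaled by a scalar gain g > 0.
    """
    return sum(
        -math.log(math.pi * g * vk) - abs(yk) ** 2 / (g * vk)
        for yk, vk in zip(y, var)
    )

def ml_gain(y, var):
    """Closed-form ML gain under the assumed model: mean of |y[k]|^2 / var[k]."""
    return sum(abs(yk) ** 2 / vk for yk, vk in zip(y, var)) / len(y)

# Hypothetical spectral vector and component variances, for illustration only.
y = [1 + 1j, 2 + 0j, 0.5j, 1 + 0j]
var = [1.0, 0.5, 2.0, 0.25]
g_hat = ml_gain(y, var)
```

Because the gain is estimated per spectral vector, a recognizer built this way would recompute `g_hat` for every frame and component before evaluating the component likelihood.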
Two additive noise compensation approaches for the spectral modeling scheme are also considered. The first approach requires a full noise HMM. This HMM is combined with the clean speech HMM to yield a noisy speech HMM. The second approach requires only the spectral shape of the noise. A term dependent on the spectral shape, together with an unknown magnitude term, is incorporated into the clean speech HMM to yield a noisy speech HMM. The unknown magnitude of the noise is estimated via the ML criterion. The performance of these two approaches for isolated digit recognition in noise is demonstrated and compared to a robust cepstral modeling approach from the literature.
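The second approach can be sketched under a simple assumption: if speech and noise are independent, zero-mean, and additive in the complex spectral domain, the variance of each noisy coefficient is the sum of the speech variance and the noise variance. The sketch below writes the noise variance as m * noise_shape[k], where noise_shape is the known spectral shape and m is the unknown magnitude, and finds m by a plain grid search over the log-likelihood. The abstract does not describe how the paper computes the estimate, so the grid search, the function names, and the parameters `m_max` and `steps` are hypothetical choices for illustration, not the authors' method.

```python
import math

def noisy_log_likelihood(y, speech_var, noise_shape, m):
    """Log-likelihood of y[k] ~ CN(0, speech_var[k] + m * noise_shape[k]).

    Assumed model: independent additive speech and noise, so variances add.
    """
    total = 0.0
    for yk, sk, nk in zip(y, speech_var, noise_shape):
        v = sk + m * nk
        total += -math.log(math.pi * v) - abs(yk) ** 2 / v
    return total

def ml_noise_magnitude(y, speech_var, noise_shape, m_max=100.0, steps=10001):
    """Grid search for the m in [0, m_max] that maximizes the log-likelihood."""
    best_m = 0.0
    best_ll = noisy_log_likelihood(y, speech_var, noise_shape, 0.0)
    for i in range(1, steps):
        m = m_max * i / (steps - 1)
        ll = noisy_log_likelihood(y, speech_var, noise_shape, m)
        if ll > best_ll:
            best_m, best_ll = m, ll
    return best_m

# Hypothetical two-bin example: flat speech variances and flat noise shape.
m_hat = ml_noise_magnitude([2 + 0j, 2j], [1.0, 1.0], [1.0, 1.0])
```

In the first approach, where a full noise HMM is available, the same additive-variance assumption would let each pair of speech and noise components be combined by summing their diagonal variances; the search over m is only needed when the noise level is unknown.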
Cite as: Roberts, W.J.J., Furui, S. (2000) Robust speech recognition via modeling spectral coefficients with HMM's with complex Gaussian components. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, 484-487, doi: 10.21437/ICSLP.2000-854
@inproceedings{roberts00_icslp, author={William J.J. Roberts and Sadaoki Furui}, title={{Robust speech recognition via modeling spectral coefficients with HMM's with complex Gaussian components}}, year=2000, booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)}, pages={vol. 4, 484-487}, doi={10.21437/ICSLP.2000-854} }