A second order statistics spectrum estimation (SOSSE) method for speech enhancement is presented. The DFT amplitude spectral components of the noisy signal are assumed to be random variables. The first and second order statistics of the noise-only spectrum are estimated and then used to enhance the noisy signal spectrum. As a reference, a fast discrete cosine transform based signal subspace (FDCTSS) method was implemented. The Aurora 2 database of digit sequences was used to show the effectiveness of both methods in improving speech recognition. Both methods performed well under the clean training condition, achieving total relative improvements in recognition accuracy of 30.75% (SOSSE) and 26.31% (FDCTSS). Under multi-condition training, the proposed SOSSE method outperformed the FDCTSS method, with total relative improvements of 17.50% (SOSSE) and -4.53% (FDCTSS).
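The abstract does not give the SOSSE enhancement rule itself, so the following is only a minimal sketch of the general idea: estimate the per-bin mean and standard deviation of DFT magnitudes over noise-only frames, then use them to attenuate a noisy frame. The function names, the parameters `k` and `floor`, and the subtraction rule are all assumptions made for illustration (a generic spectral-subtraction-style stand-in), not the authors' method.

```python
import numpy as np

def noise_stats(noise_frames):
    """Per-bin mean and standard deviation of DFT magnitudes.

    noise_frames: 2-D array, one noise-only time-domain frame per row.
    """
    mags = np.abs(np.fft.rfft(noise_frames, axis=1))
    return mags.mean(axis=0), mags.std(axis=0)

def enhance_frame(frame, mu, sigma, k=1.0, floor=0.01):
    """Attenuate one noisy frame using the noise statistics.

    Hypothetical rule (an assumption, not taken from the paper):
    subtract mu + k * sigma from each magnitude bin and keep at
    least floor * magnitude so that no bin becomes negative.
    """
    spec = np.fft.rfft(frame)
    mag = np.abs(spec)
    phase = np.angle(spec)
    clean_mag = np.maximum(mag - (mu + k * sigma), floor * mag)
    # Reuse the noisy phase and return to the time domain.
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(frame))

# Usage on synthetic data: white noise stands in for a noise-only segment.
rng = np.random.default_rng(0)
noise = rng.standard_normal((20, 256))
mu, sigma = noise_stats(noise)
enhanced = enhance_frame(noise[0], mu, sigma)
```

In this sketch the second order statistic only widens the subtraction threshold through `k`; how the paper actually combines the two statistics is not stated in the abstract.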
Cite as: Jarc, B., Babic, R. (2001) Second order statistics spectrum estimation method for robust speech recognition. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 229-232, doi: 10.21437/Eurospeech.2001-80
@inproceedings{jarc01_eurospeech,
  author={Bojan Jarc and Rudolf Babic},
  title={{Second order statistics spectrum estimation method for robust speech recognition}},
  year=2001,
  booktitle={Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages={229--232},
  doi={10.21437/Eurospeech.2001-80}
}