Noise-masking, implemented by adding a constant offset to the linear spectral estimates, provides a feature space that is more stable under changes in noise statistics. This leads to performance equivalent to that achieved by explicit noise modelling. Experimental results show that the masking spectrum need not be identical to that of the noise, implying a generally reduced sensitivity to noise. For very low SNR conditions (<9 dB), masking yields better results than explicit modelling.
Excessive masking levels lead to a convergence of performance across a range of SNR levels from clean to 0 dB and beyond, implying a stabilised feature space. However, results are degraded for the clean case.
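The masking operation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unnormalised DCT-II, the toy spectra, and the mask level of 1.0 are all assumptions made for illustration, since the abstract does not specify the front end or the masking levels used.

```python
import math


def masked_cepstrum(power_spectrum, mask, n_ceps):
    """Cepstral features from a linear spectrum with a constant mask added.

    Hypothetical helper: the mask is added in the linear domain, before the
    logarithm, and the cepstrum is an unnormalised DCT-II of the log spectrum.
    """
    if len(power_spectrum) != len(mask):
        raise ValueError("spectrum and mask must have the same length")
    n = len(power_spectrum)
    # Adding the offset before the log limits how far low-energy bins can
    # move when the noise level changes.
    log_spec = [math.log(p + m) for p, m in zip(power_spectrum, mask)]
    return [
        sum(log_spec[k] * math.cos(math.pi * q * (k + 0.5) / n) for k in range(n))
        for q in range(n_ceps)
    ]


def cepstral_distance(a, b):
    """Euclidean distance between two cepstral vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy data (assumed values, not taken from the paper).
clean = [1.0, 0.01, 4.0, 0.02]
noisy = [c + 0.1 for c in clean]   # additive noise in the linear domain
no_mask = [0.0] * len(clean)       # no masking: plain log spectrum
flat_mask = [1.0] * len(clean)     # flat mask, deliberately unlike the noise

d_unmasked = cepstral_distance(
    masked_cepstrum(clean, no_mask, 4), masked_cepstrum(noisy, no_mask, 4)
)
d_masked = cepstral_distance(
    masked_cepstrum(clean, flat_mask, 4), masked_cepstrum(noisy, flat_mask, 4)
)
print(d_unmasked, d_masked)
```

In this toy case the clean and noisy cepstra lie closer together once the mask is applied, even though the mask does not match the noise spectrum. A very large mask would also flatten the differences between clean spectra, which is consistent with the degradation reported for the clean case.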
Cite as: Openshaw, J.P., Mason, J.S. (1994) Optimal noise-masking of cepstral features for robust speaker identification. Proc. ESCA Workshop on Automatic Speaker Recognition, Identification and Verification, 231-234
@inproceedings{openshaw94_asriv, author={J. P. Openshaw and J. S. Mason}, title={{Optimal noise-masking of cepstral features for robust speaker identification}}, year=1994, booktitle={Proc. ESCA Workshop on Automatic Speaker Recognition, Identification and Verification}, pages={231--234} }