The spectral parameters obtained by filtering the frequency sequence of log mel-scaled filter-bank energies with a first- or second-order FIR filter have proved competitive for speech recognition. Recently, the authors have shown that this frequency filtering can approximately equalize the cepstrum variance, enhancing the oscillations of the spectral envelope curve that are most effective for discriminating between speakers. Speaker identification results even better than those obtained with mel-cepstrum were observed on the TIMIT database, especially when white noise was added. In this paper, the hybridization of linear prediction and filter-bank spectral analysis, using either the cepstral transformation or the alternative frequency filtering, is explored for speaker verification. This combination, which had been shown to outperform conventional techniques in clean and noisy word recognition, has yielded good text-dependent speaker verification results on the new speaker-oriented telephone-line POLYCOST database.
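As an illustration of the frequency filtering step, the following minimal sketch applies a first-order and a second-order FIR filter along the band axis of a matrix of log filter-bank energies. The abstract does not give the filter coefficients, so the forms used here, y[k] = x[k] - a·x[k-1] and y[k] = x[k+1] - x[k-1], are assumptions chosen for illustration, as are the function names, the zero padding at the band edges, and the default a = 1.0.

```python
import numpy as np


def frequency_filter_first_order(log_energies, a=1.0):
    """Apply y[k] = x[k] - a * x[k-1] along the band axis (assumed filter form).

    log_energies: array of shape (frames, bands) or (bands,).
    The band before the first one is taken to be zero (assumption).
    """
    x = np.atleast_2d(np.asarray(log_energies, dtype=float))
    # Prepend one zero band so that x[k-1] exists for k = 0.
    padded = np.pad(x, ((0, 0), (1, 0)), mode="constant")
    return x - a * padded[:, :-1]


def frequency_filter_second_order(log_energies):
    """Apply y[k] = x[k+1] - x[k-1] along the band axis (assumed filter form).

    log_energies: array of shape (frames, bands) or (bands,).
    One zero band is assumed on each side of the filter bank.
    """
    x = np.atleast_2d(np.asarray(log_energies, dtype=float))
    # Zero-pad one band at each end so both neighbours always exist.
    padded = np.pad(x, ((0, 0), (1, 1)), mode="constant")
    return padded[:, 2:] - padded[:, :-2]


if __name__ == "__main__":
    # Two hypothetical frames of four log filter-bank energies each.
    frames = np.array([[1.0, 2.0, 4.0, 7.0],
                       [0.5, 0.5, 0.5, 0.5]])
    print(frequency_filter_first_order(frames))
    print(frequency_filter_second_order(frames))
```

Both functions keep the number of bands unchanged, so the filtered parameters can replace the log energies frame by frame in a feature pipeline.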
Cite as: Hernando, J., Nadeu, C. (1998) Speaker verification on the polycost database using frequency filtered spectral energies. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0724, doi: 10.21437/ICSLP.1998-212
@inproceedings{hernando98_icslp, author={Javier Hernando and Climent Nadeu}, title={{Speaker verification on the polycost database using frequency filtered spectral energies}}, year=1998, booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)}, pages={paper 0724}, doi={10.21437/ICSLP.1998-212} }