12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Harmonic Structure Transform for Speaker Recognition

Kornel Laskowski (1), Qin Jin (2)

(1) KTH, Sweden
(2) Carnegie Mellon University, USA

We evaluate a new filterbank structure, yielding the harmonic structure cepstral coefficients (HSCCs), on a mismatched-session closed-set speaker classification task. The novelty of the filterbank lies in its averaging of energy at frequencies related by harmonicity rather than by adjacency. We present improvements which achieve a 37% relative reduction in error rate under these conditions. The improved features are combined with a similar Mel-frequency cepstral coefficient (MFCC) system to yield error rate reductions of 32% relative, suggesting that HSCCs offer information which is complementary to that available to today's MFCC-based systems.
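
To illustrate what averaging energy "by harmonicity rather than by adjacency" can mean, the following is a minimal sketch and not the authors' implementation. It assumes a simple comb-shaped filter per candidate fundamental frequency, a fixed number of harmonics, and a plain DCT-II to obtain cepstral-style coefficients. The function names, the candidate fundamentals, and all parameter values are hypothetical choices made for illustration; the filterbank in the paper may be designed quite differently.

```python
import numpy as np

def harmonic_comb_filterbank(n_bins, sample_rate, n_fft, f0_candidates, n_harmonics=5):
    """Hypothetical comb filterbank: one row per candidate fundamental.

    Each row averages the spectrum bins nearest to the first n_harmonics
    integer multiples of its fundamental (harmonically related bins),
    instead of a band of adjacent bins as a Mel filter would.
    """
    bank = np.zeros((len(f0_candidates), n_bins))
    for i, f0 in enumerate(f0_candidates):
        for h in range(1, n_harmonics + 1):
            k = int(round(h * f0 * n_fft / sample_rate))  # nearest FFT bin
            if k < n_bins:
                bank[i, k] += 1.0
        total = bank[i].sum()
        if total > 0:
            bank[i] /= total  # normalize so each row is an average
    return bank

def harmonic_cepstra(frame, sample_rate, f0_candidates, n_ceps=4):
    """Illustrative pipeline: power spectrum -> comb filterbank -> log -> DCT-II."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame * np.hanning(n_fft))) ** 2
    bank = harmonic_comb_filterbank(len(power), sample_rate, n_fft, f0_candidates)
    log_energy = np.log(bank @ power + 1e-10)  # small floor avoids log(0)
    n = len(log_energy)
    # Unnormalized DCT-II basis, written out to keep the sketch NumPy-only.
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), np.arange(n) + 0.5) / n)
    return basis @ log_energy

# Usage: a synthetic frame with a 100 Hz fundamental and four overtones.
sample_rate, n_fft = 16000, 512
t = np.arange(n_fft) / sample_rate
frame = sum(np.sin(2 * np.pi * 100 * h * t) for h in range(1, 6))
f0_candidates = np.arange(80.0, 320.0, 20.0)  # assumed candidate grid, in Hz
coeffs = harmonic_cepstra(frame, sample_rate, f0_candidates)
```

In this sketch each filter output depends on bins that are far apart in frequency but share a common fundamental, which is the contrast with adjacency-based Mel filters that the abstract draws.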

Bibliographic reference. Laskowski, Kornel / Jin, Qin (2011): "Harmonic structure transform for speaker recognition", in INTERSPEECH-2011, 365-368.