16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Frequency Offset Correction in Single Sideband (SSB) Speech by Deep Neural Network for Speaker Verification

Hua Xing, Gang Liu, John H. L. Hansen

University of Texas at Dallas, USA

Communication system mismatch is a major cause of performance loss in speaker recognition. This paper considers one type of nonlinear communication system mismatch: modulation/demodulation (Mod/DeMod) carrier drift in single sideband (SSB) speech signals. We focus on estimating the frequency offset in SSB speech in order to improve speaker verification performance on the drifted speech. Building on a two-step framework from previous work, we propose using a multi-layered neural network architecture, the stacked denoising autoencoder (SDA), to determine the unique interval of the offset value in the first step. Experimental results demonstrate that the SDA-based system can produce up to a +16.1% relative improvement in frequency offset estimation accuracy. A speaker verification evaluation shows a +65.9% relative improvement in EER when the SSB speech signal is compensated with the frequency offset value estimated by the proposed method.
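To make the notion of carrier drift concrete, the sketch below models a frequency offset as a constant shift of every spectral component, which is the usual textbook description of an SSB Mod/DeMod carrier mismatch, and shows that applying the opposite shift undoes it. This is only an illustration under stated assumptions, not the estimation method of the paper: the function names `frequency_shift` and `dominant_frequency`, the 8 kHz sample rate, the 440 Hz test tone, and the 50 Hz offset are all hypothetical choices, and the offset is assumed to be already known rather than estimated.

```python
import numpy as np


def frequency_shift(x, offset_hz, sample_rate):
    """Shift every spectral component of the real signal x by offset_hz.

    Hypothetical helper: builds the analytic signal with an FFT-based
    Hilbert transform, rotates it by the offset, and keeps the real part.
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero the negative frequencies to obtain the analytic signal.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(spectrum * weights)
    t = np.arange(n) / sample_rate
    return np.real(analytic * np.exp(2j * np.pi * offset_hz * t))


def dominant_frequency(x, sample_rate):
    """Return the frequency (Hz) of the largest magnitude-spectrum peak."""
    magnitude = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    return freqs[np.argmax(magnitude)]


sample_rate = 8000                        # assumed telephone-band rate
t = np.arange(sample_rate) / sample_rate  # one second of samples
tone = np.sin(2 * np.pi * 440.0 * t)      # stand-in for a speech harmonic

drifted = frequency_shift(tone, 50.0, sample_rate)       # simulated drift
restored = frequency_shift(drifted, -50.0, sample_rate)  # compensation

print(dominant_frequency(drifted, sample_rate))
print(dominant_frequency(restored, sample_rate))
```

Because the shift is additive rather than multiplicative, harmonics of voiced speech stop being integer multiples of the fundamental after drift, which is one reason speaker features degrade until the offset is compensated.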

Bibliographic reference. Xing, Hua / Liu, Gang / Hansen, John H. L. (2015): "Frequency offset correction in single sideband (SSB) speech by deep neural network for speaker verification", in INTERSPEECH-2015, 1156-1160.