A Novel Normalization Method for Autocorrelation Function for Pitch Detection and for Speech Activity Detection

Qiguang Lin, Yiwen Shao


Autocorrelation functions (ACFs) are used in many pitch detection algorithms (PDAs) and in voicing-feature-based speech activity detection (SAD) techniques. Speech is assumed to be stationary over a short-term window, and a Hanning window is typically applied when the ACF is calculated. As a result of this windowing, the ACF tapers as the autocorrelation lag increases. Boersma demonstrated that the tapering effect can be compensated for by dividing the ACF of the windowed signal by the autocorrelation of the window function itself; this method is referred to as wACF hereafter. We recently found that wACF can overcompensate and thereby introduce errors in pitch detection. In this paper, we propose a novel normalization method, eACF, that both mitigates the tapering effect and minimizes the overcompensation. The new method is evaluated on synthetic speech and on the TIMIT database with various types of additive noise at different signal-to-noise ratios (SNRs). The results show that the new method improves performance in both pitch detection and speech activity detection. We also investigate the scenarios in which applying the wACF method is advantageous and those in which it is not.
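
As a rough illustration of the tapering effect and of the window-autocorrelation compensation (wACF) described above, the following Python sketch may help. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `acf` and `wacf`, the 16 kHz sampling rate, the 40 ms frame, and the 200 Hz test tone are all hypothetical choices made for this example. The proposed eACF method is not shown, because the abstract does not specify it.

```python
import numpy as np

def acf(x):
    """Autocorrelation for non-negative lags, scaled so that lag 0 equals 1."""
    n = len(x)
    # Zero-pad to 2n so the FFT-based circular correlation matches the linear one.
    spec = np.fft.rfft(x, 2 * n)
    r = np.fft.irfft(spec * np.conj(spec), 2 * n)[:n]
    return r / r[0]

def wacf(frame, max_lag):
    """Return (tapered ACF, window-compensated ACF) up to max_lag (exclusive).

    Assumption: compensation is a plain division of the windowed signal's
    ACF by the ACF of the Hanning window, as summarized in the abstract.
    """
    window = np.hanning(len(frame))
    r_signal = acf((frame - frame.mean()) * window)
    r_window = acf(window)
    # Restrict to short lags, where the window ACF is far from zero.
    return r_signal[:max_lag], r_signal[:max_lag] / r_window[:max_lag]

# Hypothetical test signal: a 200 Hz sine sampled at 16 kHz, 40 ms frame.
fs = 16000
f0 = 200.0
t = np.arange(int(0.04 * fs)) / fs
frame = np.sin(2 * np.pi * f0 * t)

max_lag = len(frame) // 2
tapered, compensated = wacf(frame, max_lag)
period = int(round(fs / f0))  # lag of one pitch period, in samples

print("tapered ACF at one period:    ", tapered[period])
print("compensated ACF at one period:", compensated[period])
```

For this clean periodic signal, the tapered ACF peak at one pitch period falls below 1 purely because of the window, and the division brings it back close to 1. Restricting the lags to half the frame length is a design choice of this sketch: at long lags the window autocorrelation becomes small, so the division amplifies whatever is in the numerator, which is one plausible way for overcompensation to arise in noisy conditions.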


DOI: 10.21437/Interspeech.2018-45

Cite as: Lin, Q., Shao, Y. (2018) A Novel Normalization Method for Autocorrelation Function for Pitch Detection and for Speech Activity Detection. Proc. Interspeech 2018, 2097-2101, DOI: 10.21437/Interspeech.2018-45.


@inproceedings{Lin2018,
  author={Qiguang Lin and Yiwen Shao},
  title={A Novel Normalization Method for Autocorrelation Function for Pitch Detection and for Speech Activity Detection},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={2097--2101},
  doi={10.21437/Interspeech.2018-45},
  url={http://dx.doi.org/10.21437/Interspeech.2018-45}
}