This paper presents a new pitch-based spectral enhancement algorithm that operates on voiced frames, for use in speech analysis and noise-robust speech processing. The proposed algorithm simultaneously determines a time-warping function (TWF) and the speaker's pitch with high precision. The technique reduces the smearing between harmonics that arises when the fundamental frequency is not constant within the analysis window. To select these parameters, we propose a metric called the harmonic residual, which measures the difference between the actual spectrum and a spectrum resynthesized from the linear model of speech production, evaluated over various combinations of TWF and high-precision pitch values. The TWF and pitch pair that yields the minimum harmonic residual is selected, and the enhanced spectrum is obtained accordingly. We show how this new representation can be used for automatic speech recognition by proposing a robust spectral representation derived from harmonic amplitude interpolation.
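The joint search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not specify: the TWF is taken to be a linear pitch-change warp with a single slope parameter `alpha`, the resynthesis is a windowed least-squares harmonic fit, and the harmonic residual is taken to be the normalized energy of the difference between the two complex spectra. The function names `harmonic_residual` and `search_twf_and_pitch`, the grids, and the synthetic test frame are all hypothetical.

```python
import numpy as np

def harmonic_residual(frame, fs, f0, alpha, n_harm=10):
    """Normalized spectral mismatch between a frame and a warped harmonic fit.

    Assumed (hypothetical) warp: tau(t) = t + 0.5 * alpha * t**2, so the
    instantaneous pitch is f0 * (1 + alpha * t), with t centered on the frame.
    """
    n = len(frame)
    t = (np.arange(n) - (n - 1) / 2) / fs           # centered time axis (s)
    tau = t + 0.5 * alpha * t**2                     # warped time axis
    k = np.arange(1, n_harm + 1)                     # harmonic numbers
    phase = 2 * np.pi * f0 * np.outer(tau, k)        # shape (n, n_harm)
    basis = np.hstack([np.cos(phase), np.sin(phase)])
    w = np.hanning(n)                                # analysis window
    # Least-squares harmonic amplitudes on the windowed frame.
    coef, *_ = np.linalg.lstsq(basis * w[:, None], frame * w, rcond=None)
    resynth = basis @ coef
    spec = np.fft.rfft(frame * w)                    # actual spectrum
    spec_hat = np.fft.rfft(resynth * w)              # resynthesized spectrum
    return np.sum(np.abs(spec - spec_hat) ** 2) / np.sum(np.abs(spec) ** 2)

def search_twf_and_pitch(frame, fs, f0_grid, alpha_grid, n_harm=10):
    """Exhaustive grid search for the (f0, alpha) pair with minimum residual."""
    best_res, best_f0, best_alpha = np.inf, None, None
    for f0 in f0_grid:
        for alpha in alpha_grid:
            res = harmonic_residual(frame, fs, f0, alpha, n_harm)
            if res < best_res:
                best_res, best_f0, best_alpha = res, f0, alpha
    return best_res, best_f0, best_alpha

# Synthetic voiced frame (illustrative only): 40 ms at 16 kHz, pitch gliding
# around 150 Hz with relative slope 2.0 per second.
fs = 16000
n = 640
t_demo = (np.arange(n) - (n - 1) / 2) / fs
tau_true = t_demo + 0.5 * 2.0 * t_demo**2
rng = np.random.default_rng(0)
frame = sum(
    (1.0 / k) * np.cos(2 * np.pi * k * 150.0 * tau_true + rng.uniform(0, 2 * np.pi))
    for k in range(1, 11)
)

f0_grid = np.arange(140.0, 160.5, 0.5)               # candidate pitch values (Hz)
alpha_grid = np.linspace(-4.0, 4.0, 9)               # candidate warp slopes (1/s)
best_res, best_f0, best_alpha = search_twf_and_pitch(frame, fs, f0_grid, alpha_grid)
print(best_f0, best_alpha, best_res)
```

In this sketch, setting `alpha = 0` corresponds to assuming a constant pitch within the window; comparing the residual at `alpha = 0` with the minimum found by the search gives a rough sense of how much harmonic smearing the warp accounts for. A finer pitch grid or a richer family of warping functions could be substituted without changing the search structure.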
Bibliographic reference. Kaewtip, Kantapon / Tan, Lee Ngee / Alwan, Abeer (2013): "A pitch-based spectral enhancement technique for robust speech processing", In INTERSPEECH-2013, 3284-3288.