Interspeech'2005 - Eurospeech
Phonetic classification for low-rate speech coding is generally restricted to either a simple binary voiced/unvoiced classification of entire speech frames or a more complicated estimation of the voicing in a set of frequency bands. A good compromise between these two techniques is to estimate a single cut-off frequency that separates the spectrum into a voiced region (below the cut-off) and an unvoiced region (above it). Many existing cut-off frequency estimation methods use a fixed periodic spectrum to model voiced harmonics. However, due to pitch jitter, voiced harmonics do not always appear at regular intervals in the spectrum. This paper proposes a voicing cut-off estimation approach that combines speech energy, speech auto-correlation between frames, and residual harmonic matching. Objective evaluation indicates that the algorithm is accurate and reliable. Subjective results obtained by embedding the algorithm in a low-rate harmonic speech coder indicate that the technique is suitable for supporting high-quality low-rate speech synthesis. The proposed algorithm also has relatively low complexity and introduces only a single frame of algorithmic delay.
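As an illustration of the single cut-off idea only, and not of the estimation algorithm proposed in the paper, the following minimal sketch labels the pitch harmonics of one frame as voiced or unvoiced once a cut-off frequency is known. The function name `split_harmonics`, its parameters, and the 4000 Hz default bandwidth are assumptions made for this example. Harmonics are placed at exact multiples of the pitch, which is the fixed periodic model that the abstract notes is not always accurate under pitch jitter.

```python
def split_harmonics(f0_hz, cutoff_hz, bandwidth_hz=4000.0):
    """Split the harmonics of f0_hz into voiced and unvoiced lists.

    Hypothetical helper: harmonics at or below cutoff_hz are treated as
    voiced, those above it as unvoiced. Harmonics are assumed to sit at
    exact integer multiples of f0_hz (no pitch jitter is modelled).
    """
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")

    voiced, unvoiced = [], []
    k = 1
    while k * f0_hz <= bandwidth_hz:
        freq = k * f0_hz
        if freq <= cutoff_hz:
            voiced.append(freq)
        else:
            unvoiced.append(freq)
        k += 1
    return voiced, unvoiced


# A 100 Hz pitch with a 1500 Hz cut-off in a 4000 Hz band.
voiced, unvoiced = split_harmonics(100.0, 1500.0)
print(len(voiced), len(unvoiced))
```

Under these assumptions, a cut-off of 0 Hz or one at the full bandwidth reproduces the binary unvoiced and voiced frame decisions, so the single cut-off sits between the binary classification and a per-band voicing estimate.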
Bibliographic reference. Bao, Changchun / Lukasiak, Jason / Ritz, Christian (2005): "A novel voicing cut-off determination for low bit-rate harmonic speech coding", In INTERSPEECH-2005, 2709-2712.