ISCA Archive ICSLP 1998

A sinusoidal harmonic vocoder at 1.2 kbps using auditory perceptual characteristics

Minoru Kohata

In this paper, a very low bit rate speech coder operating at 1.2 kbps is proposed. Like the LPC vocoder, it requires only a few types of information (power, pitch, and spectral information), but its quality is far superior. In the proposed vocoder, the synthesized speech quality is improved by exploiting auditory perceptual characteristics. The synthesis method is a form of harmonic coding that uses sinusoids whose frequencies are multiples of the fundamental frequency, with the amplitudes of the sinusoids adaptively modulated using Gammatone filters as a perceptual weighting filter. The sinusoids' phases are also adjusted so as to maximize the perceptual quality. To reduce the total bit rate to 1.2 kbps, a new segment coder for spectral information (LSP coefficients) using DP matching is also proposed. According to MOS and preference tests, the quality of the synthesized speech is considerably improved compared with that of the simple LPC vocoder.
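
The harmonic synthesis idea named in the abstract, summing sinusoids whose frequencies are integer multiples of the fundamental frequency, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `synthesize_harmonics`, the 8000 Hz sample rate, the 160-sample frame length, and the example amplitudes and phases are all assumptions chosen for illustration, and the Gammatone-based amplitude modulation and perceptual phase adjustment are not reproduced here.

```python
import math


def synthesize_harmonics(f0_hz, amplitudes, phases,
                         sample_rate_hz=8000, n_samples=160):
    """Sum sinusoids at integer multiples of f0_hz for one frame.

    amplitudes[k] and phases[k] belong to harmonic number k + 1.
    Harmonics at or above the Nyquist frequency are skipped.
    All default values are illustrative assumptions, not values
    taken from the paper.
    """
    if len(amplitudes) != len(phases):
        raise ValueError("amplitudes and phases must have equal length")

    nyquist_hz = sample_rate_hz / 2.0
    frame = []
    for n in range(n_samples):
        t = n / sample_rate_hz  # time of sample n in seconds
        sample = 0.0
        for k, (amp, phi) in enumerate(zip(amplitudes, phases), start=1):
            freq_hz = k * f0_hz  # k-th harmonic of the fundamental
            if freq_hz >= nyquist_hz:
                break  # higher harmonics would alias
            sample += amp * math.cos(2.0 * math.pi * freq_hz * t + phi)
        frame.append(sample)
    return frame


# Hypothetical usage: a 100 Hz fundamental with two harmonics.
frame = synthesize_harmonics(100.0, [1.0, 0.5], [0.0, 0.0])
```

In a vocoder of the kind described, the per-harmonic amplitudes would be derived from the transmitted spectral information and the phases chosen by the decoder; here they are simply passed in as lists.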


doi: 10.21437/ICSLP.1998-388

Cite as: Kohata, M. (1998) A sinusoidal harmonic vocoder at 1.2 kbps using auditory perceptual characteristics. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0037, doi: 10.21437/ICSLP.1998-388

@inproceedings{kohata98_icslp,
  author={Minoru Kohata},
  title={{A sinusoidal harmonic vocoder at 1.2 kbps using auditory perceptual characteristics}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0037},
  doi={10.21437/ICSLP.1998-388}
}