ISCA Archive ICSLP 1998

Improving speaker recognisability in phonetic vocoders

Carlos M. Ribeiro, Isabel M. Trancoso

Phonetic vocoding is one of the methods for coding speech below 1000 bit/s. The transmitter stage includes a phone recogniser; the index of each recognised phone is transmitted together with prosodic information such as duration, energy and pitch variation. This type of coder does not transmit spectral speaker characteristics, and speaker recognisability thus becomes a major problem. In our previous work, we adapted a speaker modification strategy to minimise this problem, modifying a codebook to match the spectral characteristics of the input speaker. This is done at the cost of transmitting the LSP averages computed for vowel and glide phones. This paper presents new codebook generation strategies, with gender dependence and interpolation frames, that lead to better speaker recognisability and speech quality. Relative to our previous work, some effort was also devoted to deriving more efficient quantization methods for the speaker-specific information, which considerably reduced the average bit rate without quality degradation.


doi: 10.21437/ICSLP.1998-396

Cite as: Ribeiro, C.M., Trancoso, I.M. (1998) Improving speaker recognisability in phonetic vocoders. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0448, doi: 10.21437/ICSLP.1998-396

@inproceedings{ribeiro98_icslp,
  author={Carlos M. Ribeiro and Isabel M. Trancoso},
  title={{Improving speaker recognisability in phonetic vocoders}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0448},
  doi={10.21437/ICSLP.1998-396}
}