5th International Conference on Spoken Language Processing
Phonetic vocoding is one of the methods for coding speech below 1000 bit/s. The transmitter stage includes a phone recogniser whose phone index is transmitted together with prosodic information such as duration, energy and pitch variation. This type of coder does not transmit spectral speaker characteristics, so speaker recognisability becomes a major problem. In our previous work, we adapted a speaker modification strategy to minimise this problem, modifying a codebook to match the spectral characteristics of the input speaker. This is done at the cost of transmitting the LSP averages computed for vowel and glide phones. This paper presents new codebook generation strategies, with gender dependence and interpolation frames, that lead to better speaker recognisability and speech quality. Relative to our previous work, some effort was also devoted to deriving more efficient quantization methods for the speaker-specific information, which considerably reduced the average bit rate without quality degradation.
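The abstract does not state the exact rule used to modify the codebook, so the following is only a minimal sketch of one plausible reading: each codebook LSP vector is shifted by the difference between the transmitted speaker LSP average and the codebook's own average. The mean-offset rule, the function names `mean_vector` and `adapt_codebook`, and the three-coefficient LSP values are all assumptions made for illustration and are not taken from the paper.

```python
def mean_vector(vectors):
    """Component-wise average of a list of equal-length LSP vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def adapt_codebook(codebook, speaker_mean):
    """Shift every codebook entry so the codebook average matches speaker_mean.

    Assumption: adaptation is a plain per-coefficient offset. The paper may
    use a different (for example per-phone or weighted) rule.
    """
    codebook_mean = mean_vector(codebook)
    offset = [s - c for s, c in zip(speaker_mean, codebook_mean)]
    return [[x + d for x, d in zip(entry, offset)] for entry in codebook]


# Hypothetical toy data: two codebook entries with three LSP frequencies (Hz).
codebook = [[300.0, 1200.0, 2500.0], [400.0, 1000.0, 2300.0]]
# Hypothetical speaker LSP average, standing in for the transmitted averages.
speaker_mean = [380.0, 1150.0, 2450.0]

adapted = adapt_codebook(codebook, speaker_mean)
```

After the shift, the average of the adapted codebook equals the speaker average, which is the property this sketch is meant to show. A uniform offset can in principle break the increasing order that LSP coefficients must keep, so a practical version would need an ordering check that is omitted here.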
Bibliographic reference. Ribeiro, Carlos M. / Trancoso, Isabel M. (1998): "Improving speaker recognisability in phonetic vocoders", In ICSLP-1998, paper 0448.