5th International Conference on Spoken Language Processing
This paper describes a speaker adaptation technique for a phonetic vocoder based on hidden Markov models (HMMs). In the vocoder, the encoder performs phoneme recognition and transmits phoneme indexes and state durations to the decoder, and the decoder synthesizes speech using an HMM-based speech synthesis technique. One of the main problems of this vocoder is that the voice characteristics of the synthetic speech depend on the HMMs used in the decoder, and are therefore fixed regardless of the input speaker. To overcome this problem, we adapt the HMMs to the input speech by transmitting transfer vectors, which carry information on the mismatch between the input speech and the HMMs. The results of subjective tests show that the performance of the proposed vocoder without quantization of the transfer vectors is comparable to that of a speaker-dependent vocoder.
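The adaptation idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a transfer vector is the average difference between the input frames aligned to an HMM state and that state's speaker-independent mean, and that the decoder adds the vector back to the mean. The function names, the string state labels, and the toy data are all hypothetical.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def estimate_transfer_vectors(
    frames: Sequence[Vector],
    alignment: Sequence[str],
    si_means: Dict[str, Vector],
) -> Dict[str, Vector]:
    """Encoder side (assumed): average residual between input frames and state means.

    `alignment[t]` is the hypothetical label of the HMM state that frame t
    was assigned to during phoneme recognition.
    """
    sums: Dict[str, Vector] = {}
    counts: Dict[str, int] = {}
    for frame, state in zip(frames, alignment):
        mean = si_means[state]
        acc = sums.setdefault(state, [0.0] * len(mean))
        for d, (x, m) in enumerate(zip(frame, mean)):
            acc[d] += x - m
        counts[state] = counts.get(state, 0) + 1
    # One transfer vector per state that was actually observed.
    return {s: [v / counts[s] for v in acc] for s, acc in sums.items()}


def adapt_means(
    si_means: Dict[str, Vector],
    transfer: Dict[str, Vector],
) -> Dict[str, Vector]:
    """Decoder side (assumed): shift each state mean by its transfer vector.

    States without a received transfer vector keep their original mean.
    """
    adapted: Dict[str, Vector] = {}
    for state, mean in si_means.items():
        shift = transfer.get(state, [0.0] * len(mean))
        adapted[state] = [m + t for m, t in zip(mean, shift)]
    return adapted


# Toy example with two-dimensional spectral parameters (made-up values).
si_means = {"a1": [1.0, 1.0], "a2": [0.5, -0.5], "a3": [9.0, 9.0]}
frames = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
alignment = ["a1", "a1", "a2"]

transfer = estimate_transfer_vectors(frames, alignment, si_means)
adapted = adapt_means(si_means, transfer)
```

In this sketch only the phoneme indexes, the state durations, and the per-state transfer vectors would need to be sent; the speaker-independent means are assumed to be stored in the decoder already.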
#1 - synthesized from original spectral parameters
#2 - coded speech using speaker-dependent models
#3 - coded speech using speaker-independent models without adaptation
#4 - coded speech using adapted models without quantization of transfer vectors
#5 - coded speech using adapted models with quantization of transfer vectors
Bibliographic reference. Masuko, Takashi / Tokuda, Keiichi / Kobayashi, Takao (1998): "A very low bit rate speech coder using HMM with speaker adaptation", In ICSLP-1998, paper 0777.