This paper presents the concept of a voice profile as a complete description of the distributions of the acoustic correlates and the speaking characteristics of a speaker. A voice profile can be considered a unified speaker-dependent probability model of speech, with applications in speaker identification, adaptive speech recognition, voice morphing and text-to-speech synthesis. The spectral and temporal parameters that define a voice profile are obtained from hidden Markov models (HMMs) of speech. The HMMs are trained on extended feature vectors that include features for recognition, synthesis and identification. A method of ranking the acoustic correlates of a speaker's voice is proposed, based on an analysis of the relative distance of each voice correlate from that of the gender-dependent modal voice. The voice profile is then applied to voice conversion. Experimental results of speaker profiling and its evaluation in voice morphing are presented.
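The ranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not define the distance measure, so the normalised absolute deviation used here is an assumption, as are the function name `rank_correlates`, the correlate names, and all numeric values, which are hypothetical and not taken from the paper.

```python
def rank_correlates(speaker, modal_mean, modal_std):
    """Rank correlates by distance from the modal voice, largest first.

    Assumption: distance is the absolute deviation from the modal mean,
    divided by the modal standard deviation. The paper may use a
    different measure.
    """
    distances = {}
    for name, value in speaker.items():
        distances[name] = abs(value - modal_mean[name]) / modal_std[name]
    # Sort by distance in descending order: the most distinctive correlate first.
    return sorted(distances.items(), key=lambda item: item[1], reverse=True)


# Hypothetical example values, for illustration only.
speaker = {"pitch_hz": 140.0, "f1_hz": 520.0, "speaking_rate": 4.0}
modal_mean = {"pitch_hz": 120.0, "f1_hz": 500.0, "speaking_rate": 5.0}
modal_std = {"pitch_hz": 20.0, "f1_hz": 40.0, "speaking_rate": 0.5}

ranking = rank_correlates(speaker, modal_mean, modal_std)
for name, distance in ranking:
    print(f"{name}: {distance:.2f}")
```

Under these assumed values, the correlate that deviates most from the modal voice relative to its spread is listed first, which is one plausible reading of "relative distance" in the abstract.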
Cite as: Rentzos, D., Vaseghi, S., Yan, Q. (2004) Voice profile: a structured probability model with application to voice morphing. Proc. The Speaker and Language Recognition Workshop (Odyssey 2004), 193-198
@inproceedings{rentzos04_odyssey,
  author={Dimitrios Rentzos and Saeed Vaseghi and Qin Yan},
  title={{Voice profile: a structured probability model with application to voice morphing}},
  year=2004,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2004)},
  pages={193--198}
}