ISCA Archive ICSLP 2000

Perception of synthesized singing voices with fine fluctuations in their fundamental frequency contours

Masato Akagi, Hironori Kitakaze

This paper quantitatively demonstrates the importance of fine fluctuations by measuring their detection thresholds in the fundamental frequency (F0) contours of singing voices, for which voice quality is particularly important. We analyzed the fine fluctuations that remain after subtracting the melody and vibrato components from estimated F0s, focusing on their modulation frequency (MF) and modulation amplitude (MA). To test the hypothesis that fine fluctuations in the F0 of singing voices affect perceived quality, and that the magnitude of this effect depends on the MF and MA, we performed four psychoacoustic experiments using synthesized stimuli. The experimental results support the hypothesis. They suggest that, to produce high-quality synthesized speech, one should extract F0s containing fine fluctuations with an MF of over 7 Hz in the analysis, and add not only melody and vibrato but also fine fluctuation components to the F0 contours in the synthesis.
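The decomposition described in the abstract (F0 = melody + vibrato + fine fluctuations) can be illustrated with a minimal Python sketch. It is not the authors' procedure. It assumes, purely for illustration, that each component is a single sinusoid expressed in cents around a constant melody note, and that the melody and vibrato are known exactly when they are subtracted. The function names, the 5 Hz vibrato rate, and all default values are hypothetical; only the idea of an MF above 7 Hz comes from the abstract.

```python
import math

def cents_to_ratio(cents):
    """Convert a pitch offset in cents to a frequency ratio."""
    return 2.0 ** (cents / 1200.0)

def synth_f0(melody_hz, n_frames, frame_rate_hz,
             vib_rate_hz=5.0, vib_depth_cents=50.0,
             fine_mf_hz=10.0, fine_ma_cents=5.0):
    """Build an F0 contour (Hz) as melody + vibrato + fine fluctuation.

    Assumption: the fine fluctuation is one sinusoid with modulation
    frequency fine_mf_hz (MF) and modulation amplitude fine_ma_cents (MA).
    """
    contour = []
    for i in range(n_frames):
        t = i / frame_rate_hz
        vibrato = vib_depth_cents * math.sin(2 * math.pi * vib_rate_hz * t)
        fine = fine_ma_cents * math.sin(2 * math.pi * fine_mf_hz * t)
        contour.append(melody_hz * cents_to_ratio(vibrato + fine))
    return contour

def fine_residual_cents(contour, melody_hz, frame_rate_hz,
                        vib_rate_hz=5.0, vib_depth_cents=50.0):
    """Subtract the (assumed known) melody and vibrato from an F0 contour.

    Returns the remaining fine fluctuation in cents, frame by frame.
    """
    residual = []
    for i, f0 in enumerate(contour):
        t = i / frame_rate_hz
        total_cents = 1200.0 * math.log2(f0 / melody_hz)
        vibrato = vib_depth_cents * math.sin(2 * math.pi * vib_rate_hz * t)
        residual.append(total_cents - vibrato)
    return residual

# One second of a 440 Hz note sampled at 1000 frames per second.
f0 = synth_f0(440.0, n_frames=1000, frame_rate_hz=1000.0)
residual = fine_residual_cents(f0, 440.0, frame_rate_hz=1000.0)
peak_ma = max(abs(r) for r in residual)
```

Here `peak_ma` recovers the modulation amplitude that was put in, because the melody and vibrato are removed exactly. With real recordings, both components would have to be estimated from the signal, so the residual would also contain estimation error.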


Cite as: Akagi, M., Kitakaze, H. (2000) Perception of synthesized singing voices with fine fluctuations in their fundamental frequency contours. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 458-461

@inproceedings{akagi00_icslp,
  author={Masato Akagi and Hironori Kitakaze},
  title={{Perception of synthesized singing voices with fine fluctuations in their fundamental frequency contours}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={3},
  pages={458--461}
}