Personalized Quantification of Voice Attractiveness in Multidimensional Merit Space

Yasunari Obuchi


Voice attractiveness is an indicator that is partly objective and partly subjective. It would be helpful to assume that each voice has its own attractiveness. However, the paired comparison results of human listeners are sometimes inconsistent. In this paper, we propose a multidimensional mapping scheme of voice attractiveness, which accounts for both the objective merit values of voices and the subjective preferences of listeners. Paired comparison is modeled in a probabilistic framework, and the optimal mapping is obtained from the paired comparison results under the maximum likelihood criterion.
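As an illustration of the general idea only, the following is a minimal sketch of maximum likelihood fitting for paired comparisons in a multidimensional merit space. It is not the formulation used in the paper. It assumes a Bradley-Terry-style logistic model in which each voice i has a merit vector m_i, each listener k has a preference vector w_k, and listener k prefers voice i over voice j with probability sigmoid(w_k · (m_i - m_j)). The function names, the toy data, the learning rate, and the use of plain gradient ascent are all hypothetical choices made for this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_likelihood(merits, weights, comparisons):
    """Sum of log P(winner preferred over loser) under the assumed logistic model."""
    ll = 0.0
    for listener, winner, loser in comparisons:
        z = weights[listener] @ (merits[winner] - merits[loser])
        ll += np.log(sigmoid(z))
    return ll

def fit(comparisons, n_voices, n_listeners, dim=2, lr=0.05, steps=500, seed=0):
    """Gradient ascent on the log-likelihood (assumed optimizer, for illustration)."""
    rng = np.random.default_rng(seed)
    merits = 0.1 * rng.standard_normal((n_voices, dim))   # one merit vector per voice
    weights = rng.standard_normal((n_listeners, dim))     # one preference vector per listener
    for _ in range(steps):
        grad_m = np.zeros_like(merits)
        grad_w = np.zeros_like(weights)
        for k, i, j in comparisons:
            diff = merits[i] - merits[j]
            # d/dz log(sigmoid(z)) = 1 - sigmoid(z)
            r = 1.0 - sigmoid(weights[k] @ diff)
            grad_m[i] += r * weights[k]
            grad_m[j] -= r * weights[k]
            grad_w[k] += r * diff
        merits += lr * grad_m
        weights += lr * grad_w
    return merits, weights

# Toy data (hypothetical): tuples of (listener, preferred voice, other voice).
# Listener 0 ranks the voices 0 > 1 > 2, listener 1 ranks them 2 > 1 > 0.
comparisons = [(0, 0, 1), (0, 1, 2), (0, 0, 2),
               (1, 2, 1), (1, 1, 0), (1, 2, 0)]
merits, weights = fit(comparisons, n_voices=3, n_listeners=2, dim=2)
print(log_likelihood(merits, weights, comparisons))
```

In this sketch, setting dim=1 gives a single merit axis, while dim=2 lets listeners weight two merit axes differently, which is one way a model can represent listeners who disagree with each other.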

The merit values can be estimated from acoustic features using a machine learning framework. We show how the estimation process works using a real database consisting of common Japanese greeting utterances. Experiments using 1- and 2-dimensional merit spaces confirm that predicting comparison results from acoustic features becomes more accurate in the 2-dimensional case.


DOI: 10.21437/Interspeech.2017-130

Cite as: Obuchi, Y. (2017) Personalized Quantification of Voice Attractiveness in Multidimensional Merit Space. Proc. Interspeech 2017, 2223-2227, DOI: 10.21437/Interspeech.2017-130.


@inproceedings{Obuchi2017,
  author={Yasunari Obuchi},
  title={Personalized Quantification of Voice Attractiveness in Multidimensional Merit Space},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2223--2227},
  doi={10.21437/Interspeech.2017-130},
  url={http://dx.doi.org/10.21437/Interspeech.2017-130}
}