Evaluation of Singing Synthesis: Methodology and Case Study with Concatenative and Performative Systems

Lionel Feugère, Christophe d’Alessandro, Samuel Delalez, Luc Ardaillon, Axel Roebel


The special session Singing Synthesis Challenge: Fill-In the Gap aims at comparative evaluation of singing synthesis systems. The task is to synthesize a new couplet for two popular songs. This paper addresses the methodology needed for quality assessment of singing synthesis systems and reports on a case study using two systems with a total of six different configurations. The two synthesis systems are a concatenative Text-to-Chant (TTC) system, which includes a parametric representation of the melodic curve, and a Singing Instrument (SI), which allows for real-time interpretation of utterances made of flat-pitch natural voice or diphone-concatenated voice. Absolute Category Rating (ACR) and Paired Comparison (PC) tests are used. Natural and natural-degraded reference conditions are used for calibration of the ACR test. The mean opinion scores (MOS) obtained with ACR show that the TTC ranks below natural voice but above the degraded conditions, whereas the SI ranks below natural voice and in between the degraded conditions. Singing synthesis quality is thus judged better than auto-tuned or distorted natural voice in some cases. PC results show that: 1/ signal processing is an important quality issue, making the difference between systems; 2/ diphone concatenation degrades the quality compared to flat-pitch natural voice; 3/ automatic melodic modelling is preferred to gestural control for off-line synthesis.


DOI: 10.21437/Interspeech.2016-1248

Cite as

Feugère, L., d’Alessandro, C., Delalez, S., Ardaillon, L., Roebel, A. (2016) Evaluation of Singing Synthesis: Methodology and Case Study with Concatenative and Performative Systems. Proc. Interspeech 2016, 1245-1249.

Bibtex
@inproceedings{Feugère+2016,
author={Lionel Feugère and Christophe d’Alessandro and Samuel Delalez and Luc Ardaillon and Axel Roebel},
title={Evaluation of Singing Synthesis: Methodology and Case Study with Concatenative and Performative Systems},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1248},
url={http://dx.doi.org/10.21437/Interspeech.2016-1248},
pages={1245--1249}
}