ESCA Tutorial and Research Workshop on
As text-to-speech systems develop, it becomes necessary to compare different solutions and to evaluate whether a change in the synthesis procedure affects the listener's attitude to the system. This investigation examines whether intelligibility, naturalness, and user satisfaction (i.e. acceptability) can be scaled directly with the magnitude estimation (ME) technique. In a classical ME experiment, the subject is required to make direct numerical estimates of the sensory magnitudes produced by different stimuli. The first experiment assesses whether the ME judgements vary with the number and range of test conditions, whether they depend on the subject's familiarity with the test material, and whether the ME scales are practice-invariant. The second experiment evaluates the relationship between the "objective" measures of speech intelligibility (proportion of words understood correctly) and the "subjective" measures (MEs). It also investigates the relationship between speech recognition scores on semantically correct and semantically anomalous sentences. The third experiment studies how the ME scales of acceptability, naturalness, and intelligibility vary with the severity of external distortion (noise), and whether there are important dependencies among acceptability, naturalness, and intelligibility.
Bibliographic reference. Pavlovic, Chaslav V. / Rossi, Mario / Espesser, Robert (1989): "Subjective assessment of acceptability, intelligibility and naturalness of text-to-speech synthesis", In SIOA-1989, Vol.2, 94-98.