Sixth International Conference on Spoken Language Processing
The goal of this paper is to locate and understand the subtle, fundamental problems in the representation of speech sounds by a very high-quality speech analysis/synthesis engine, namely STRAIGHT. The approach is to evaluate the system using subjective measures. We use the diagnostic rhyme test (DRT) to evaluate the intelligibility of speech analysed and synthesised by this system at various analysis frame rates. We then categorise the problems found and suggest possible improvements. The DRT results indicate that STRAIGHT can produce speech with an average DRT score of 95 for analysis frame rates between 1 and 5 ms. In addition, subjective quality tests using mean opinion score (MOS) and modulated noise reference unit (MNRU) measures were conducted. These quality tests were carried out for three versions of the STRAIGHT system: versions 17, 23 and 30. The DRT was carried out using version 23 only. Based on the subjective evaluation results, possible improvements to the STRAIGHT system are discussed.
Bibliographic reference. Zolfaghari, Parham / Atake, Yoshinori / Shikano, Kiyohiro / Kawahara, Hideki (2000): "Investigation of analysis and synthesis parameters of STRAIGHT by subjective evaluation", In ICSLP-2000, vol. 3, 498-501.