Eighth ISCA Workshop on Speech Synthesis

Barcelona, Catalonia, Spain
August 31-September 2, 2013

An Experimental Comparison of Multiple Vocoder Types

Qiong Hu (1), Korin Richmond (1), Junichi Yamagishi (1), Javier Latorre (2)

(1) University of Edinburgh, UK; (2) Toshiba Research Europe, UK

This paper presents an experimental comparison of a broad range of leading, previously described vocoder types. We use a reference implementation of each vocoder to create stimuli for a listening test using copy synthesis. The listening test is performed using both Lombard and normal read speech stimuli, and with two types of question for comparison. Multi-dimensional Scaling (MDS) is conducted on the listener responses to analyse similarities in terms of quality between the vocoders. Our MDS and clustering results show that the vocoders which use a sinusoidal synthesis approach are perceptually distinguishable from the source-filter vocoders. To help interpret the axes of the resulting MDS space, we test for correlations with standard acoustic quality metrics and find that one axis is strongly correlated with PESQ scores. We also find that both speech style and the format of the listening test question may influence the test results. Finally, we present preference test results which compare each vocoder with natural speech.

Index Terms: Speech Synthesis, Vocoder, Similarity, Quality
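
As a rough illustration of the MDS step mentioned in the abstract, the sketch below embeds items in a low-dimensional space from a matrix of pairwise dissimilarities. It assumes classical (Torgerson) MDS; the abstract does not say which MDS variant the authors used. The function name `classical_mds` and the toy dissimilarity matrix are hypothetical and are not data from the paper.

```python
import numpy as np


def classical_mds(dissimilarity, n_dims=2):
    """Embed items in n_dims dimensions from a symmetric dissimilarity matrix.

    Assumption: classical (Torgerson) MDS, chosen here only for illustration.
    """
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Double-centre the squared dissimilarities: B = -1/2 * J D^2 J
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j
    # B is symmetric, so eigh applies; it returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip small negative eigenvalues, which appear when the input is not Euclidean
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Hypothetical toy input: four items placed on a line at 0, 1, 3 and 6
positions = np.array([0.0, 1.0, 3.0, 6.0])
toy = np.abs(positions[:, None] - positions[None, :])
coords = classical_mds(toy, n_dims=2)
```

In a setting like the one described, each row of `coords` would correspond to one vocoder, and the dissimilarities would have to be derived from listener responses; how that derivation is done is not specified in the abstract.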

Bibliographic reference. Hu, Qiong / Richmond, Korin / Yamagishi, Junichi / Latorre, Javier (2013): "An experimental comparison of multiple vocoder types", in SSW8, 135-140.