Eighth ISCA Workshop on Speech Synthesis
Barcelona, Catalonia, Spain
This paper presents the beginnings of a framework for formally testing the causes of the currently limited quality of HMM (Hidden Markov Model) speech synthesis. The framework separates the individual effects of modelling so that their independent impact on vocoded speech parameters can be observed, in order to address the issues that restrict progress towards highly intelligible and natural-sounding speech.
The simulated HMM synthesis conditions are applied to spectral speech parameters and evaluated in a pairwise listening test, in which listeners make a "same or different" judgement on the quality of the synthesised speech produced under each pair of conditions. The responses are then analysed using multidimensional scaling to identify the qualities of modelled speech that listeners attend to, and which thus form the basis of why it is distinguishable from natural speech.
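As a rough illustration of how "same or different" responses can feed into multidimensional scaling, the sketch below turns per-pair counts of "different" judgements into a dissimilarity matrix and embeds the conditions with classical (Torgerson) MDS. The abstract does not state which MDS variant or dissimilarity measure was used, so both are assumptions here; the function names `dissimilarity_from_responses` and `classical_mds` are hypothetical, and the counts are invented toy numbers, not data from the paper.

```python
import numpy as np


def dissimilarity_from_responses(different_counts, total_counts):
    """Proportion of 'different' judgements per condition pair.

    Assumption: dissimilarity is taken as the fraction of trials on which
    listeners answered 'different', averaged over both presentation orders.
    """
    counts = np.asarray(different_counts, dtype=float)
    totals = np.asarray(total_counts, dtype=float)
    # Pairs that were never presented get a dissimilarity of 0.
    prop = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)
    d = (prop + prop.T) / 2.0  # symmetrise across presentation order
    np.fill_diagonal(d, 0.0)   # a condition is not dissimilar to itself
    return d


def classical_mds(d, n_dims=2):
    """Classical (Torgerson) MDS: embed items so distances approximate d."""
    n = d.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared dissimilarities to obtain a Gram matrix.
    b = -0.5 * centring @ (d ** 2) @ centring
    eigvals, eigvecs = np.linalg.eigh(b)
    # Keep the n_dims largest eigenvalues; clip negatives that arise when
    # the dissimilarities are not exactly Euclidean.
    order = np.argsort(eigvals)[::-1][:n_dims]
    vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(vals)


if __name__ == "__main__":
    # Toy example: 3 hypothetical conditions, 10 trials per ordered pair.
    different = [[0, 8, 2],
                 [6, 0, 9],
                 [4, 7, 0]]
    totals = [[10, 10, 10]] * 3
    d = dissimilarity_from_responses(different, totals)
    coords = classical_mds(d, n_dims=2)
    print(coords)  # one row of 2-D coordinates per condition
```

Conditions that listeners rarely tell apart end up close together in the embedding, so the axes of the resulting space can be inspected for the qualities listeners appear to be using. A non-metric MDS variant would be an equally plausible choice for proportion data.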
Finally, future improvements to the framework are discussed, including its extension to more of the parameters modelled during speech synthesis.
Index Terms: Speech synthesis, Hidden Markov models, Vocoding
Bibliographic reference. Merritt, Thomas / King, Simon (2013): "Investigating the shortcomings of HMM synthesis", In SSW8, 165-170.