ISCA Archive SSW 2013

Investigating the shortcomings of HMM synthesis

Thomas Merritt, Simon King

This paper presents the beginnings of a framework for formally testing the causes of the currently limited quality of HMM (Hidden Markov Model) speech synthesis. The framework separates the individual effects of modelling and observes their independent impact on vocoded speech parameters, in order to address the issues that are restricting progress towards highly intelligible and natural-sounding speech synthesis. The simulated HMM synthesis conditions are applied to spectral speech parameters and evaluated in a pairwise listening test, in which listeners make a "same or different" judgement on the quality of the synthesised speech produced under each pair of conditions. These responses are then processed using multidimensional scaling to identify the qualities of modelled speech that listeners attend to, and which therefore form the basis of why it is distinguishable from natural speech. Finally, future improvements to the framework are discussed, including its extension to more of the parameters modelled during speech synthesis.
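
As a rough illustration of the analysis step described above, the sketch below turns pairwise "same or different" responses into a dissimilarity matrix and embeds the conditions with classical (Torgerson) multidimensional scaling. The abstract does not say how responses were aggregated or which MDS variant was used, so the aggregation rule (proportion of "different" responses per pair), the choice of classical MDS, and the names `dissimilarity_from_judgements` and `classical_mds` are all assumptions made for illustration.

```python
import numpy as np


def dissimilarity_from_judgements(n_conditions, judgements):
    """Aggregate (i, j, is_different) responses into a symmetric matrix.

    Assumption: dissimilarity between two conditions is the proportion of
    "different" responses given to that pair. Unseen pairs default to 0.
    """
    diff = np.zeros((n_conditions, n_conditions))
    total = np.zeros((n_conditions, n_conditions))
    for i, j, is_different in judgements:
        total[i, j] += 1
        total[j, i] += 1
        if is_different:
            diff[i, j] += 1
            diff[j, i] += 1
    # Divide only where at least one response was collected.
    d = np.divide(diff, total, out=np.zeros_like(diff), where=total > 0)
    np.fill_diagonal(d, 0.0)
    return d


def classical_mds(d, n_dims=2):
    """Classical (Torgerson) MDS: embed n points given a dissimilarity matrix."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared dissimilarities to get a Gram-like matrix.
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    # Keep the n_dims largest eigenvalues; clip small negatives to zero.
    order = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


if __name__ == "__main__":
    # Hypothetical responses for three conditions, not data from the paper.
    responses = [(0, 1, True), (0, 1, True), (0, 2, False), (1, 2, True)]
    d = dissimilarity_from_judgements(3, responses)
    print(classical_mds(d, n_dims=2))
```

In the resulting map, conditions that listeners rarely tell apart land close together, so the axes of the embedding can be inspected for the qualities listeners appear to attend to.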

Index Terms: Speech synthesis, Hidden Markov models, Vocoding

Cite as: Merritt, T., King, S. (2013) Investigating the shortcomings of HMM synthesis. Proc. 8th ISCA Workshop on Speech Synthesis (SSW 8), 165-170

  author={Thomas Merritt and Simon King},
  title={{Investigating the shortcomings of HMM synthesis}},
  booktitle={Proc. 8th ISCA Workshop on Speech Synthesis (SSW 8)},