ISCA Archive SSW 2007

SVM based feature extraction in speech synthesis

Peter Cahill, Jan Macek, Julie Carson-Berndsen

Annotations of speech recordings are a fundamental part of any unit selection speech synthesiser. However, obtaining flawless annotations is an almost impossible task. Manual techniques can achieve the most accurate annotations, provided that enough time is available to analyse every phone individually. Automatic annotation techniques are much faster than manual ones and complete the task in a far more reasonable time frame, but the resulting annotations contain a considerable amount of error. This paper introduces a technique that can ensure, with good accuracy, a degree of articulatory-acoustic similarity between annotated units. The synthesiser encourages the use of units that have been identified as having appropriate articulatory-acoustic parameters, but does not limit the domain of the speech database. This helps to identify where joins can best be performed and which annotations should be avoided at the phone level.

Cite as: Cahill, P., Macek, J., Carson-Berndsen, J. (2007) SVM based feature extraction in speech synthesis. Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6), 328-332

  author={Peter Cahill and Jan Macek and Julie Carson-Berndsen},
  title={{SVM based feature extraction in speech synthesis}},
  booktitle={Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6)},