FAAVSP - The 1st Joint Conference on
Facial Analysis, Animation, and
In this paper we evaluate two methods for the visual synthesis of Austrian German dialects with parametric hidden semi-Markov model (HSMM) based speech synthesis. The first method uses visual dialect data, i.e. visual dialect recordings annotated with dialect phonetic labels; the second uses a standard visual model and maps dialect phones to standard phones. The second method is more easily applicable, since visual dialect data is often not available. Both methods employ contextual information via decision-tree-based visual clustering of dialect or standard visual data. We show that both models achieve similar performance in a subjective pairwise comparison test. This shows that visual dialect data is not necessarily needed for visual modeling of dialects if a dialect-to-standard mapping can be used that exploits the contextual information of the standard language.

Index Terms: visual speech synthesis, dialect
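The phone-mapping step of the second method can be illustrated with a minimal sketch. It is not the authors' implementation: the phone symbols, the table `DIALECT_TO_STANDARD`, and the function `map_to_standard` are hypothetical names chosen for illustration, and the pass-through of unmapped phones is an assumption. The idea is only that a dialect label sequence is rewritten into standard phones before being handed to a model trained on standard visual data.

```python
# Hypothetical dialect-to-standard phone table. The symbols are
# illustrative placeholders, not the mapping used in the paper.
DIALECT_TO_STANDARD = {
    "oa": "aI",
    "A": "a",
    "E~": "E",
}


def map_to_standard(dialect_phones, mapping=DIALECT_TO_STANDARD):
    """Replace each dialect phone with its standard counterpart.

    Phones without an entry in the mapping are passed through
    unchanged (an assumption made for this sketch).
    """
    return [mapping.get(phone, phone) for phone in dialect_phones]


# Usage: rewrite a dialect label sequence before synthesis with a
# standard visual model.
dialect_labels = ["h", "oa", "s"]
standard_labels = map_to_standard(dialect_labels)
```

Because the mapped sequence consists of standard phones, the context-dependent clustering of the standard model can be applied to it directly, which is the contextual information the abstract refers to.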
Bibliographic reference. Schabus, Dietmar / Pucher, Michael (2015): "Comparison of dialect models and phone mappings in HSMM-based visual dialect speech synthesis", In FAAVSP-2015, 84-87.