We present an evaluation of the perception of foreign-accented natural and synthetic speech in comparison to accent-reduced synthetic speech. Our method for foreign accent conversion is based on a mapping of Hidden Semi-Markov Model states between accented and non-accented voice models and does not need an average voice model of accented speech. We apply the method to recorded data of speakers whose first language (L1) comes from different European countries and whose second language (L2) is Austrian German. Results from a subjective evaluation show that the proposed method significantly reduces the perceived accent. It also retains speaker similarity when an average voice model of the same gender is used. Participants rated the accentedness of synthetic speech significantly lower than that of natural speech, and listeners were unable to identify accents correctly for 81% of the natural and 85% of the synthesized samples. Our evaluation shows the feasibility of accent conversion with a limited amount of speech resources.
Bibliographic reference. Toman, Markus / Pucher, Michael (2015): "Evaluation of state mapping based foreign accent conversion", In INTERSPEECH-2015, 304-308.