Third International Conference on Spoken Language Processing (ICSLP 94)
The Welsh language is comparatively little researched, and this work represents the first attempt to develop speech technology for Welsh. A list of pseudo-Welsh nonsense words was generated. Certain linguistic features of Welsh, such as the relationship between stress location and phonological vowel length, made this task more complicated than for English. A native speaker was recorded reading this list. Over ten percent of the speech was segmented by hand. The segmentation was carried out at the "demi-phoneme" level, from the beginning of a phoneme to its midpoint, in order to train a segmenter to find plausible diphone boundaries automatically. The segmentations were used to train a set of Hidden Markov Models, which automatically segmented the rest of the recordings. The segmentations were corrected by hand, and pitchmarking was carried out. An index of diphone locations was produced, together with a diphone dictionary. The resulting synthesised speech can be used either for Welsh, or for English spoken with a Welsh accent.
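As a rough illustration of why demi-phoneme segmentation yields diphone boundaries, the sketch below pairs the second half of each phoneme with the first half of the next, so that every diphone runs from one phoneme midpoint to the following one. The data layout, the names `DemiPhone`, `Diphone` and `diphones_from_demiphones`, and the timings are all hypothetical and not taken from the paper; it also assumes each phoneme is split into exactly two halves at its midpoint.

```python
from typing import List, NamedTuple


class DemiPhone(NamedTuple):
    phoneme: str   # phoneme label (hypothetical labels, not the paper's inventory)
    half: int      # 0 = phoneme start to midpoint, 1 = midpoint to phoneme end
    start: float   # seconds
    end: float     # seconds


class Diphone(NamedTuple):
    name: str
    start: float   # midpoint of the left phoneme
    end: float     # midpoint of the right phoneme


def diphones_from_demiphones(segments: List[DemiPhone]) -> List[Diphone]:
    """Build a diphone index from a time-ordered list of demi-phoneme segments."""
    index = []
    for left, right in zip(segments, segments[1:]):
        # A diphone spans the second half of one phoneme and the first half of the next.
        if left.half == 1 and right.half == 0:
            index.append(Diphone(f"{left.phoneme}-{right.phoneme}", left.start, right.end))
    return index


# Invented timings for a single nonsense word, for illustration only.
segments = [
    DemiPhone("t", 0, 0.00, 0.04),
    DemiPhone("t", 1, 0.04, 0.08),
    DemiPhone("a", 0, 0.08, 0.15),
    DemiPhone("a", 1, 0.15, 0.22),
    DemiPhone("n", 0, 0.22, 0.26),
    DemiPhone("n", 1, 0.26, 0.30),
]
index = diphones_from_demiphones(segments)
```

Under these assumptions, the word-initial and word-final half segments are left out of the index, since they have no neighbouring phoneme to pair with.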
Bibliographic reference. Williams, Briony (1994): "Diphone synthesis for the Welsh language", In ICSLP-1994, 739-742.