This paper describes an improved algorithm, motivated by fuzzy logic theory, for selecting speech segments from a large database for concatenative synthesis. Triphone HMM clustering is employed as an adaptive measure of articulatory similarity within a given database. Stress level contours are evaluated in the context of their surrounding vocalic peaks. The algorithm uses beam search to optimise both the suitability of each candidate unit for realising the desired target and the continuity between concatenated units.
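The beam search over candidate units can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the cost functions, so the names `beam_search_units`, `target_cost`, and `join_cost` are hypothetical, the costs are simply added together (the paper's nonlinear, fuzzy-logic-motivated combination is not reproduced), and units are stand-in numbers rather than speech segments.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

U = TypeVar("U")  # a candidate unit; its real structure is not given in the abstract


def beam_search_units(
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[int, U], float],
    join_cost: Callable[[U, U], float],
    beam_width: int = 3,
) -> Tuple[float, List[U]]:
    """Pick one unit per target position, keeping only the best partial paths.

    Assumption: total cost is the plain sum of target and join costs.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every target position needs at least one candidate")

    # Seed the beam with the cheapest candidates for the first position.
    beam = sorted(
        ((target_cost(0, u), [u]) for u in candidates[0]),
        key=lambda hyp: hyp[0],
    )[:beam_width]

    for i in range(1, len(candidates)):
        # Extend every surviving path with every candidate at position i.
        extended = [
            (cost + join_cost(path[-1], u) + target_cost(i, u), path + [u])
            for cost, path in beam
            for u in candidates[i]
        ]
        # Prune: keep only the beam_width cheapest hypotheses.
        beam = sorted(extended, key=lambda hyp: hyp[0])[:beam_width]

    return beam[0]


# Toy usage: units are numbers standing in for a single acoustic feature.
targets = [1.0, 2.0, 3.0]
toy_candidates = [[1.0, 5.0], [2.5, 9.0], [3.0, 2.0]]
best_cost, best_path = beam_search_units(
    toy_candidates,
    target_cost=lambda i, u: abs(u - targets[i]),  # distance to the desired target
    join_cost=lambda a, b: 0.5 * abs(a - b),       # discontinuity between neighbours
    beam_width=2,
)
print(best_cost, best_path)
```

Because only `beam_width` hypotheses survive each step, the search is approximate: a wider beam trades computation for a better chance of finding the lowest-cost sequence.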
Cite as: Holzapfel, M., Campbell, N. (1998) A nonlinear unit selection strategy for concatenative speech synthesis based on syllable level features. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0521, doi: 10.21437/ICSLP.1998-53
@inproceedings{holzapfel98_icslp,
  author={Martin Holzapfel and Nick Campbell},
  title={{A nonlinear unit selection strategy for concatenative speech synthesis based on syllable level features}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0521},
  doi={10.21437/ICSLP.1998-53}
}