Two applications of statistically generated decision trees to problems in speech synthesis are described: (1) End-of-sentence detection: A decision tree is generated to decide when a period in text corresponds to the end of a declarative sentence (and not an abbreviation). The result is 99.8% correct classification on the Brown corpus. (2) Segment duration modelling in speech synthesis: 1500 utterances from a single speaker were used to build a decision tree that predicts segment durations based on features such as lexical position, stress, and phonetic context. The result is prediction with residuals that have a standard deviation of 23 milliseconds, and synthesis that compares favorably with current hand-generated duration rules.
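As a rough illustration of the second application, the sketch below fits a one-level regression tree to a handful of segment durations and reports the standard deviation of the residuals, which is the kind of figure quoted above. Everything in it is an assumption made for illustration: the two binary features (`stressed`, `phrase_final`), the toy durations, and the helper names are hypothetical, and the squared-error split criterion is a common choice for regression trees rather than a procedure confirmed by the abstract.

```python
from statistics import mean, pstdev

# Toy data, invented for illustration only (not taken from the paper):
# each row is (stressed, phrase_final, duration_ms).
SEGMENTS = [
    (True, False, 90.0),
    (True, True, 140.0),
    (False, False, 60.0),
    (False, True, 100.0),
    (True, False, 95.0),
    (False, False, 55.0),
]

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def split(rows, i):
    """Partition durations by the truth value of binary feature i."""
    yes = [r[-1] for r in rows if r[i]]
    no = [r[-1] for r in rows if not r[i]]
    return yes, no

def fit_stump(rows, n_features=2):
    """Fit a one-level tree: pick the feature whose split minimises total SSE.

    Assumes the chosen feature takes both values in `rows`, so that
    neither leaf is empty.
    """
    best = min(range(n_features), key=lambda i: sum(sse(p) for p in split(rows, i)))
    yes, no = split(rows, best)
    return best, mean(yes), mean(no)

def predict(stump, row):
    """Predict a duration as the mean of the leaf the row falls into."""
    i, mean_yes, mean_no = stump
    return mean_yes if row[i] else mean_no

stump = fit_stump(SEGMENTS)
residuals = [r[-1] - predict(stump, r) for r in SEGMENTS]
residual_sd = pstdev(residuals)  # spread of the prediction errors, in ms
print(f"split on feature {stump[0]}, residual SD = {residual_sd:.1f} ms")
```

A full tree would apply the same split search recursively within each leaf and stop according to some pruning or minimum-size rule; the single split here is only meant to show how a residual standard deviation arises from leaf-mean predictions.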
Cite as: Riley, M.D. (1990) Tree-based modelling for speech synthesis. Proc. First ESCA Workshop on Speech Synthesis (SSW 1), 229-232
@inproceedings{riley90_ssw,
  author={Michael D. Riley},
  title={{Tree-based modelling for speech synthesis}},
  year=1990,
  booktitle={Proc. First ESCA Workshop on Speech Synthesis (SSW 1)},
  pages={229--232}
}