The defining property of a Concept-to-Speech system is that it combines language and speech generation. Language generation converts the input concepts into natural language, which speech generation subsequently transforms into speech. Potentially, this leads to more natural-sounding output than a plain Text-to-Speech system can achieve, since the correct placement of pitch accents and intonational boundaries (an important factor contributing to the naturalness of the generated speech) is co-determined by syntactic and discourse information, which is typically available in the language generation module. In this paper, a generic algorithm for the generation of coherent spoken monologues, called D2S, is discussed. Language generation is carried out by a module called LGM, which is based on TAG-like syntactic structures with open slots, combined with conditions that determine when a syntactic structure can be used properly. A speech generation module converts the output of the LGM into speech, using either phrase concatenation or diphone synthesis.
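The core idea of the LGM, syntactic templates with open slots whose use is gated by applicability conditions on the discourse context, can be illustrated with a minimal sketch. This is not the paper's actual implementation; the `Template` class, the context keys, and the example sentences are all hypothetical stand-ins for the general mechanism.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical sketch: each syntactic template has open slots and a
# condition that inspects the discourse context before the template
# may be used (names and context keys are illustrative, not from D2S).

@dataclass
class Template:
    pattern: str                       # surface form with named slots
    slots: List[str]                   # open slots to be filled
    condition: Callable[[Dict], bool]  # applicability test on the context

def applicable(templates: List[Template], context: Dict) -> List[Template]:
    """Return only those templates whose conditions hold in this context."""
    return [t for t in templates if t.condition(context)]

def realize(template: Template, fillers: Dict[str, str]) -> str:
    """Fill the open slots of a template with expressions for the concepts."""
    return template.pattern.format(**fillers)

# Toy discourse constraint: use a pronoun only if the referent is given.
templates = [
    Template("{artist} composed {piece}.", ["artist", "piece"],
             lambda ctx: not ctx.get("artist_given", False)),
    Template("He also composed {piece}.", ["piece"],
             lambda ctx: ctx.get("artist_given", False)),
]

ctx = {"artist_given": False}
first = applicable(templates, ctx)[0]
print(realize(first, {"artist": "Mozart", "piece": "KV 314"}))

ctx["artist_given"] = True
second = applicable(templates, ctx)[0]
print(realize(second, {"piece": "KV 313"}))
```

In this toy setup the condition plays the role the abstract describes: syntactic and discourse information (here, whether the artist is already given) decides which structure may be used, which in turn constrains accentuation and phrasing downstream.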
Cite as: Klabbers, E., Krahmer, E., Theune, M. (1998) A generic algorithm for generating spoken monologues. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0278, doi: 10.21437/ICSLP.1998-576
@inproceedings{klabbers98b_icslp,
  author    = {Esther Klabbers and Emiel Krahmer and Mariët Theune},
  title     = {{A generic algorithm for generating spoken monologues}},
  year      = {1998},
  booktitle = {Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages     = {paper 0278},
  doi       = {10.21437/ICSLP.1998-576}
}