The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis
Designing text scripts that cover enough phonetic units and prosodic phenomena is very important when recording a speech database for corpus-based speech synthesis. When designing recording scripts for speech synthesis databases, much of the effort usually goes into achieving maximal coverage of phonetic units with a minimal amount of recorded speech. Such methods often select sentences that contain difficult words or incorrect grammar. Speakers find it difficult to read these sentences correctly and naturally, and the selected sentences may not be suitable for child speakers or non-native speakers. To address these problems, we propose to consider readability in text selection. The experiment shows that scripts selected with the proposed method have both good unit coverage of the language and good readability.
Index Terms: Text-to-speech, recording scripts, text selection, text readability
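As an illustration of the idea in the abstract, the following is a minimal sketch of coverage-driven sentence selection that also takes readability into account. It is not the authors' method: the greedy strategy, the use of letter bigrams as a stand-in for phonetic units, the word-length readability proxy, and the names `units`, `readability`, `select_scripts`, and `min_readability` are all assumptions made for this example.

```python
from typing import List, Set


def units(sentence: str) -> Set[str]:
    """Hypothetical stand-in for phonetic units: letter bigrams within each word."""
    out: Set[str] = set()
    for word in sentence.lower().split():
        letters = "".join(ch for ch in word if ch.isalpha())
        out.update(letters[i:i + 2] for i in range(len(letters) - 1))
    return out


def readability(sentence: str) -> float:
    """Toy readability proxy in [0, 1]: long average word length lowers the score."""
    words = [w for w in sentence.split() if any(ch.isalpha() for ch in w)]
    if not words:
        return 0.0
    avg_len = sum(len(w) for w in words) / len(words)
    return 1.0 / (1.0 + max(0.0, avg_len - 4.0))


def select_scripts(candidates: List[str], budget: int,
                   min_readability: float = 0.3) -> List[str]:
    """Greedily pick up to `budget` sentences, weighting new coverage by readability."""
    # Drop sentences that the proxy judges too hard to read aloud.
    remaining = [s for s in candidates if readability(s) >= min_readability]
    covered: Set[str] = set()
    chosen: List[str] = []
    while remaining and len(chosen) < budget:
        # Score = number of not-yet-covered units, scaled by readability.
        best = max(remaining, key=lambda s: len(units(s) - covered) * readability(s))
        if not (units(best) - covered):
            break  # no remaining sentence adds new units
        chosen.append(best)
        covered |= units(best)
        remaining.remove(best)
    return chosen
```

In a real system the unit inventory would come from a phonetic transcription of each sentence and the readability score from a proper readability measure. The sketch only shows how a readability term can filter and re-rank candidates inside a coverage-maximizing loop.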
Bibliographic reference. Dong, Minghui / Cen, Ling / Chan, Paul / Li, Haizhou (2010): "Considering readability in text-to-speech recording script design", In SSW7-2010, 312-316.