ISCA Archive SSW 2010

Considering readability in text-to-speech recording script design

Minghui Dong, Ling Cen, Paul Chan, Haizhou Li

Designing text scripts that cover enough phonetic units and prosodic phenomena is very important when recording a speech database for corpus-based speech synthesis. When designing recording scripts for such databases, much of the effort usually goes into achieving maximal coverage of phonetic units with minimal speech recording. Such methods often select sentences that contain difficult words or incorrect grammar, which speakers find hard to read correctly and naturally. The selected sentences may also be unsuitable for child speakers or non-native speakers. To address these problems, we propose considering readability in text selection. Our experiment shows that scripts selected with the proposed method achieve good unit coverage of the language and good readability.
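
The trade-off described in the abstract, unit coverage against readability, can be illustrated with a small greedy selection sketch. This is not the authors' method, which the abstract does not specify. It assumes character bigrams as a stand-in for phonetic units and the share of words outside a known word list as a stand-in for reading difficulty. The names `units`, `difficulty`, `select_script`, and the `weight` parameter are all hypothetical.

```python
def units(sentence):
    """Character bigrams, an assumed crude stand-in for phonetic units."""
    text = "".join(sentence.lower().split())
    return {text[i:i + 2] for i in range(len(text) - 1)}


def difficulty(sentence, common_words):
    """Fraction of words not in common_words (0.0 = easy, 1.0 = hard).

    An assumed readability proxy, not the measure used in the paper.
    """
    words = sentence.lower().split()
    if not words:
        return 1.0
    return sum(w not in common_words for w in words) / len(words)


def select_script(candidates, common_words, max_sentences, weight=1.0):
    """Greedily pick sentences by new-unit gain, discounted by difficulty.

    weight should lie in [0, 1]; weight=0.0 ignores readability entirely.
    """
    covered, selected = set(), []
    remaining = list(candidates)
    while remaining and len(selected) < max_sentences:
        # Keep only sentences that still add at least one uncovered unit.
        remaining = [s for s in remaining if units(s) - covered]
        if not remaining:
            break
        best = max(
            remaining,
            key=lambda s: len(units(s) - covered)
            * (1.0 - weight * difficulty(s, common_words)),
        )
        selected.append(best)
        covered |= units(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    corpus = [
        "the cat sat",
        "the cat sat on the mat",
        "perspicacious zygomorphic quokka",
    ]
    known = {"the", "cat", "sat", "on", "mat"}
    print(select_script(corpus, known, max_sentences=1))
    print(select_script(corpus, known, max_sentences=1, weight=0.0))
```

With `weight=0.0` the sketch reduces to plain coverage-driven selection, which is the behaviour the abstract identifies as favouring hard-to-read sentences. Raising `weight` discounts such sentences in favour of easier ones.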

Index Terms: Text-to-speech, recording scripts, text selection, text readability


Cite as: Dong, M., Cen, L., Chan, P., Li, H. (2010) Considering readability in text-to-speech recording script design. Proc. 7th ISCA Workshop on Speech Synthesis (SSW 7), 312-316

@inproceedings{dong10_ssw,
  author={Minghui Dong and Ling Cen and Paul Chan and Haizhou Li},
  title={{Considering readability in text-to-speech recording script design}},
  year=2010,
  booktitle={Proc. 7th ISCA Workshop on Speech Synthesis (SSW 7)},
  pages={312--316}
}