ISCA Archive ICSLP 2000

Corpus-based techniques in the AT&T nextgen synthesis system

Ann K. Syrdal, Colin W. Wightman, Alistair Conkie, Yannis Stylianou, Mark Beutnagel, Juergen Schroeter, Volker Strom, Ki-Seung Lee, Matthew J. Makashay

The AT&T text-to-speech (TTS) synthesis system has been used as a framework for experimenting with a perceptually guided, data-driven approach to speech synthesis, with primary focus on data-driven elements in the "back end". Statistical training techniques applied to a large corpus are used to make decisions about predicted speech events and selected speech inventory units. Our recent advances in automatic phonetic and prosodic labeling, together with new, faster implementations of the harmonic plus noise model (HNM) and of unit preselection, have significantly improved TTS quality and reduced both development time and runtime.


Cite as: Syrdal, A.K., Wightman, C.W., Conkie, A., Stylianou, Y., Beutnagel, M., Schroeter, J., Strom, V., Lee, K.-S., Makashay, M.J. (2000) Corpus-based techniques in the AT&T nextgen synthesis system. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 410-415.

@inproceedings{syrdal00b_icslp,
  author={Ann K. Syrdal and Colin W. Wightman and Alistair Conkie and Yannis Stylianou and Mark Beutnagel and Juergen Schroeter and Volker Strom and Ki-Seung Lee and Matthew J. Makashay},
  title={{Corpus-based techniques in the AT\&T nextgen synthesis system}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={3},
  pages={410--415}
}