Preliminary guidelines for the efficient management of OOV words for spoken text

Christina Tånnander, Jens Edlund


We investigate the practical short-term and long-term effects of five different frequency ranks used for selecting which out-of-vocabulary (OOV) words to add to a pronunciation lexicon for text-to-speech (TTS) of university textbooks. The work is an empirical study on a corpus of 200 university textbooks selected for talking book production, and it takes the extensive pronunciation lexicon of a commercial text-to-speech system as its baseline. The main take-home message is a concise set of guidelines that promise to increase the efficiency of OOV management, at least for text-to-speech production of university textbooks.


DOI: 10.21437/SSW.2019-25

Cite as: Tånnander, C., Edlund, J. (2019) Preliminary guidelines for the efficient management of OOV words for spoken text. Proc. 10th ISCA Speech Synthesis Workshop, 137-142, DOI: 10.21437/SSW.2019-25.


@inproceedings{Tånnander2019,
  author={Christina Tånnander and Jens Edlund},
  title={{Preliminary guidelines for the efficient management of OOV words for spoken text}},
  year=2019,
  booktitle={Proc. 10th ISCA Speech Synthesis Workshop},
  pages={137--142},
  doi={10.21437/SSW.2019-25},
  url={http://dx.doi.org/10.21437/SSW.2019-25}
}