We investigate the practical short-term and long-term effects of five different frequency ranks used for selecting which out-of-vocabulary (OOV) words to add to a pronunciation lexicon for text-to-speech (TTS) of university textbooks. The work is an empirical study on a corpus of 200 university textbooks selected for talking book production, and it takes the extensive pronunciation lexicon of a commercial text-to-speech system as its baseline. The main take-home message is a succinct set of guidelines that promise to increase the efficiency of OOV management, at least for text-to-speech production of university textbooks.
Cite as: Tånnander, C., Edlund, J. (2019) Preliminary guidelines for the efficient management of OOV words for spoken text. Proc. 10th ISCA Workshop on Speech Synthesis (SSW 10), 137-142, doi: 10.21437/SSW.2019-25
@inproceedings{tannander19_ssw,
  author={Christina Tånnander and Jens Edlund},
  title={{Preliminary guidelines for the efficient management of OOV words for spoken text}},
  year=2019,
  booktitle={Proc. 10th ISCA Workshop on Speech Synthesis (SSW 10)},
  pages={137--142},
  doi={10.21437/SSW.2019-25}
}