ISCA Archive S4SG 2022

Text Normalization for Speech Systems for All Languages

Athiya Deviyani, Alan W Black

Most text-to-speech systems are limited in that they expect their input to be strings of characters with standard pronunciations, and they struggle when the input contains symbols, numbers, or abbreviations, which often occur in real text. One of the most common ways to address this problem is to automatically map non-standard words to standard words using statistical, neural, or rule-driven methods. However, despite significant efforts to normalize such words, existing corpora vary so much that capturing edge cases remains extremely challenging. In this work, we propose a tool that aids data collection from (non-programmer) native speakers so that numbers and other common non-standard words can be mapped to standard words that a synthesizer can pronounce correctly. We also address related problems, such as identifying which non-standard words commonly appear in text and how to question native speakers to obtain enough information for a useful normalization of non-standard words.
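As a rough illustration of the rule-driven mapping mentioned above, the following minimal Python sketch replaces non-standard tokens using lookup tables of the kind a native speaker might supply. It is not the tool proposed in the paper: the table contents, the `normalize` function, and the digit-by-digit reading of numbers are all assumptions made for this example, and English is used only for readability.

```python
import re

# Hypothetical answers collected from a native speaker (illustrative only).
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}
ABBREVIATIONS = {"Dr.": "doctor", "km": "kilometers"}


def normalize(text: str) -> str:
    """Map non-standard tokens to pronounceable standard words.

    Assumption: tokens are whitespace-separated, and numbers are read
    digit by digit. A real system would need language-specific number
    grammar and context-dependent rules.
    """
    out = []
    for token in text.split():
        if token in ABBREVIATIONS:
            # Known abbreviation: use the speaker-provided expansion.
            out.append(ABBREVIATIONS[token])
        elif re.fullmatch(r"[0-9]+", token):
            # ASCII digit string: spell out each digit in turn.
            out.extend(DIGIT_WORDS[d] for d in token)
        else:
            # Already a standard word (or an unhandled case): keep as-is.
            out.append(token)
    return " ".join(out)


print(normalize("Dr. Lee ran 42 km"))
```

Unhandled tokens pass through unchanged in this sketch, which mirrors the edge-case problem the abstract describes: any non-standard word missing from the tables reaches the synthesizer unnormalized.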


doi: 10.21437/S4SG.2022-5

Cite as: Deviyani, A., Black, A.W. (2022) Text Normalization for Speech Systems for All Languages. Proc. 1st Workshop on Speech for Social Good (S4SG), 20-25, doi: 10.21437/S4SG.2022-5

@inproceedings{deviyani22_s4sg,
  author={Athiya Deviyani and Alan W Black},
  title={{Text Normalization for Speech Systems for All Languages}},
  year=2022,
  booktitle={Proc. 1st Workshop on Speech for Social Good (S4SG)},
  pages={20--25},
  doi={10.21437/S4SG.2022-5}
}