On Training and Evaluation of Grapheme-to-Phoneme Mappings with Limited Data

Dravyansh Sharma


When scaling to low-resource languages for speech synthesis or speech recognition in an industrial setting, a common challenge is the absence of a readily available pronunciation lexicon. Common alternatives are handwritten letter-to-sound rules and data-driven grapheme-to-phoneme (G2P) models, but without a pronunciation lexicon it is hard even to determine their quality. We identify properties of a good quality metric and note drawbacks of naive estimates of G2P quality when test sets are small. We demonstrate a novel method for reliable evaluation of G2P accuracy with minimal human effort. We also compare the behavior of known state-of-the-art approaches for training with limited data. Finally, we evaluate a new active learning approach for training G2P models in the low-resource setting.


DOI: 10.21437/Interspeech.2018-1920

Cite as: Sharma, D. (2018) On Training and Evaluation of Grapheme-to-Phoneme Mappings with Limited Data. Proc. Interspeech 2018, 2858-2862, DOI: 10.21437/Interspeech.2018-1920.


@inproceedings{Sharma2018,
  author={Dravyansh Sharma},
  title={On Training and Evaluation of Grapheme-to-Phoneme Mappings with Limited Data},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={2858--2862},
  doi={10.21437/Interspeech.2018-1920},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1920}
}