Conditional Random Fields for the Tunisian Dialect Grapheme-to-Phoneme Conversion

Abir Masmoudi, Mariem Ellouze, Fethi Bougares, Yannick Estève, Lamia Belguith


Conditional Random Fields (CRFs) are an effective approach for monotone string-to-string translation tasks. In this work, we apply a CRF model to grapheme-to-phoneme (G2P) conversion for the Tunisian Dialect. This choice is motivated by the fact that CRFs give long-term predictions and assume relaxed state independence conditions compared to HMMs [7]. The CRF model must be trained on a 1-to-1 alignment between graphemes and phonemes. Alignments are generated using the Joint-Multigram Model (JMM) and the GIZA++ toolkit, and we train one CRF model for each generated alignment. We then compare our models to state-of-the-art G2P systems based on the Sequitur G2P and Phonetisaurus toolkits. We also investigate CRF prediction quality with different training set sizes. Our results show that CRFs perform slightly better with the JMM alignment and outperform both the Sequitur and Phonetisaurus systems at the different training set sizes. Our final system achieves a phone error rate of 14.09%.
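
The abstract reports a phone error rate without defining it. A minimal sketch of one common definition follows, assuming the rate is the total Levenshtein (edit) distance between predicted and reference phoneme sequences divided by the total number of reference phonemes. This definition, the function names `edit_distance` and `phone_error_rate`, and the toy phoneme symbols are illustrative assumptions, not the authors' evaluation code.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # reference phoneme missing from hyp
            insertion = curr[j - 1] + 1  # extra phoneme in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phone_error_rate(refs: List[Sequence[str]], hyps: List[Sequence[str]]) -> float:
    """Assumed definition: total edit errors / total reference phonemes, in percent."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must contain the same number of entries")
    total = sum(len(r) for r in refs)
    if total == 0:
        raise ValueError("reference set contains no phonemes")
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return 100.0 * errors / total


# Toy data (made-up phoneme symbols, not taken from the paper).
refs = [["k", "t", "a", "b"], ["b", "a", "b"]]
hyps = [["k", "t", "b"], ["b", "a", "b"]]
per = phone_error_rate(refs, hyps)
```

With this toy data, one deletion against seven reference phonemes gives a rate of 100/7 percent. Toolkits may normalize differently (for example, per word before averaging), so this should be read as one plausible convention.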


DOI: 10.21437/Interspeech.2016-1320

Cite as

Masmoudi, A., Ellouze, M., Bougares, F., Estève, Y., Belguith, L. (2016) Conditional Random Fields for the Tunisian Dialect Grapheme-to-Phoneme Conversion. Proc. Interspeech 2016, 1457-1461.

Bibtex
@inproceedings{Masmoudi+2016,
author={Abir Masmoudi and Mariem Ellouze and Fethi Bougares and Yannick Estève and Lamia Belguith},
title={Conditional Random Fields for the Tunisian Dialect Grapheme-to-Phoneme Conversion},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1320},
url={http://dx.doi.org/10.21437/Interspeech.2016-1320},
pages={1457--1461}
}