A Hybrid Approach to Grapheme to Phoneme Conversion in Assamese

Somnath Roy, Shakuntala Mahanta


Assamese is a low-resource Indian language. This paper implements both rule-based and data-driven grapheme-to-phoneme (G2P) conversion systems for Assamese. The rule-based system serves as the baseline and yields a word error rate (WER) of 35.3%. The data-driven systems are implemented using state-of-the-art sequence learning techniques: i) a Joint-Sequence Model (JSM), ii) a Recurrent Neural Network with LSTM cells (LSTM-RNN) and iii) a bidirectional LSTM (BiLSTM). The BiLSTM yields the lowest WER, 18.7%, an absolute improvement of 16.6 percentage points over the baseline system. We additionally implement the rules of syllabification for Assamese. The surface output is generated in two forms: i) a phonemic sequence with syllable boundaries and ii) a phonemic sequence only. The output of the BiLSTM is fed as input to the Hybrid system. The Hybrid system syllabifies the input phonemic sequences in order to apply the vowel harmony rules. It also applies the rules of schwa deletion, as well as some rules under which consonants change their form in clusters. The WER of the Hybrid system is 17.3%, an absolute improvement of 1.4 percentage points over the BiLSTM-based G2P.
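The figures quoted in the abstract can be related to one another with a short sketch. It assumes the usual G2P convention that a word counts as an error when its predicted phoneme sequence differs from the reference in any position; the abstract does not spell out its scoring procedure, so this is an illustration rather than the authors' evaluation script. The function names and the toy phoneme sequences are hypothetical placeholders, not Assamese data.

```python
def g2p_word_error_rate(references, hypotheses):
    """Return the percentage of words whose predicted phoneme sequence
    differs from the reference (assumed exact-match scoring)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    if not references:
        raise ValueError("at least one word is required")
    errors = sum(
        1 for ref, hyp in zip(references, hypotheses) if list(ref) != list(hyp)
    )
    return 100.0 * errors / len(references)


def absolute_improvement(baseline_wer, system_wer):
    """Difference between two WER values, in percentage points."""
    return round(baseline_wer - system_wer, 1)


# Toy placeholder data (not real Assamese pronunciations).
toy_refs = [["k", "a"], ["m", "o", "n"], ["b", "i"], ["t", "u"]]
toy_hyps = [["k", "a"], ["m", "u", "n"], ["b", "i"], ["t", "u"]]
toy_wer = g2p_word_error_rate(toy_refs, toy_hyps)  # one wrong word out of four

# WER values reported in the abstract.
rule_based_wer, bilstm_wer, hybrid_wer = 35.3, 18.7, 17.3
bilstm_gain = absolute_improvement(rule_based_wer, bilstm_wer)
hybrid_gain = absolute_improvement(bilstm_wer, hybrid_wer)
print(toy_wer, bilstm_gain, hybrid_gain)
```

With the reported values, the two differences come out to 16.6 and 1.4 percentage points, matching the improvements stated in the abstract.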


DOI: 10.21437/Interspeech.2018-1694

Cite as: Roy, S., Mahanta, S. (2018) A Hybrid Approach to Grapheme to Phoneme Conversion in Assamese. Proc. Interspeech 2018, 2828-2832, DOI: 10.21437/Interspeech.2018-1694.


@inproceedings{Roy2018,
  author={Somnath Roy and Shakuntala Mahanta},
  title={A Hybrid Approach to Grapheme to Phoneme Conversion in Assamese},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={2828--2832},
  doi={10.21437/Interspeech.2018-1694},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1694}
}