Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling

Sittipong Saychum, Sarawoot Kongyoung, Anocha Rugchatjaroen, Patcharika Chootrakool, Sawit Kasuriya, Chai Wutiwiwatchai


This paper presents the results of applying joint sequence modeling to Thai grapheme-to-phoneme conversion. The proposed method uses Conditional Random Fields (CRFs) in a two-stage prediction. The first CRF performs textual syllable segmentation and syllable type prediction. Graphemes and their corresponding phonemes are then aligned using many-to-many alignment rules together with the outputs of the first CRF. The second CRF, which models the jointly aligned sequences, predicts the phonemes. The proposed method clearly improves the prediction of linking syllables, which are normally not visible in the written graphemes. Evaluation results show that the word error rate (WER) of the proposed method reaches 13.66%, which is 11.09% lower than that of the baseline system.
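The abstract does not spell out how WER is computed. As a minimal sketch, assuming the common grapheme-to-phoneme convention that a word counts as an error when its predicted phoneme sequence differs from the reference in any position, the metric could be computed as follows. The function name `word_error_rate` and the toy phoneme tuples are hypothetical placeholders for illustration, not data or code from the paper.

```python
from typing import Sequence, Tuple

Phonemes = Tuple[str, ...]


def word_error_rate(references: Sequence[Phonemes],
                    predictions: Sequence[Phonemes]) -> float:
    """Return the percentage of words whose predicted phoneme
    sequence is not an exact match for the reference sequence.

    Assumption: one reference pronunciation per word and
    exact-match scoring; the paper may score differently.
    """
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have equal length")
    if not references:
        raise ValueError("at least one word is required")
    errors = sum(1 for ref, hyp in zip(references, predictions) if ref != hyp)
    return 100.0 * errors / len(references)


# Toy placeholder symbols, not real Thai transcriptions.
toy_references = [("k", "a", "n"), ("m", "a"), ("t", "i"), ("s", "a", "p")]
toy_predictions = [("k", "a", "n"), ("m", "a"), ("t", "e"), ("s", "a", "p")]

toy_wer = word_error_rate(toy_references, toy_predictions)
```

Under this definition, a single wrong phoneme makes the whole word count as an error, so the metric is stricter than a per-phoneme error rate.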


DOI: 10.21437/Interspeech.2016-621

Cite as

Saychum, S., Kongyoung, S., Rugchatjaroen, A., Chootrakool, P., Kasuriya, S., Wutiwiwatchai, C. (2016) Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling. Proc. Interspeech 2016, 1462-1466.

BibTeX
@inproceedings{Saychum+2016,
author={Sittipong Saychum and Sarawoot Kongyoung and Anocha Rugchatjaroen and Patcharika Chootrakool and Sawit Kasuriya and Chai Wutiwiwatchai},
title={Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-621},
url={http://dx.doi.org/10.21437/Interspeech.2016-621},
pages={1462--1466}
}