ISCA Archive ICSLP 2000
Input Chinese sentences using digits

Fang Zheng, Jian Wu, Wenhu Wu

Chinese character input is a key issue in a wide range of Chinese-language applications, especially when only a small numeric keypad is available. Although many Chinese character encoding schemes have been proposed based on character properties such as shape, they are not straightforward and take users a long time to learn. An easier way is to input characters via pinyin. In this paper, we establish the mapping between digit strings and pinyin syllables, as well as the mapping between pinyin strings and words, referred to as the Syllable-Digit search Tree (SDT) and the Word-Syllable search Tree (WST), respectively. Using these two search trees together with a word N-gram language model and the syllable-synchronous network search (SSNS) algorithm, any digit string can be converted into a Chinese word sequence or sentence. Without requiring users to select from candidates, the character error rate (CER) of digit-to-character (D/C) conversion is 6.6% on a test text consisting of 22,083 characters.
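The digit-to-pinyin step can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the keypad follows the common telephone letter layout (2 = abc, ..., 9 = wxyz), the syllable inventory is a tiny hypothetical sample, and a flat dictionary stands in for the SDT. The function names are illustrative only. The sketch enumerates candidate syllable segmentations; it does not implement the WST, the word N-gram language model, or the SSNS algorithm that the paper uses to choose among candidates.

```python
from collections import defaultdict

# Assumed telephone-style keypad layout; the abstract does not specify one.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}

# Hypothetical sample of pinyin syllables, for illustration only.
SAMPLE_SYLLABLES = ["zhong", "wen", "ni", "mi", "hao", "shu", "ru"]


def to_digits(syllable):
    """Encode a pinyin syllable as the digit keys that spell it."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in syllable)


def build_digit_index(syllables):
    """Map each digit code to every syllable that shares it."""
    index = defaultdict(list)
    for syllable in syllables:
        index[to_digits(syllable)].append(syllable)
    return index


def segment(digits, index, max_len=6):
    """Enumerate every split of `digits` into known syllable codes."""
    if not digits:
        return [[]]
    results = []
    for n in range(1, min(max_len, len(digits)) + 1):
        for syllable in index.get(digits[:n], []):
            for rest in segment(digits[n:], index, max_len):
                results.append([syllable] + rest)
    return results


index = build_digit_index(SAMPLE_SYLLABLES)
unambiguous = segment("94664936", index)  # one candidate: zhong + wen
ambiguous = segment("64426", index)       # "64" matches both ni and mi
```

In this sample, the code "64" is shared by two syllables, so the second digit string yields more than one candidate sequence. Ambiguity of this kind is what a language model would need to resolve when no candidate selection is asked of the user.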

Cite as: Zheng, F., Wu, J., Wu, W. (2000) Input Chinese sentences using digits. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 127-130

