Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Input Chinese Sentences Using Digits

Fang Zheng, Jian Wu, Wenhu Wu

Center of Speech Technology, State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing, China

Chinese character input is a key issue in many Chinese-language applications, especially when only a small numeric keypad is available. Although many Chinese character encoding schemes have been proposed based on character characteristics such as shape, they are not straightforward and take users a long time to learn. An easier way is to input characters via pinyin. In this paper, we establish the mapping between digit strings and pinyin as well as the mapping between pinyin strings and words, referred to as the Syllable-Digit search Tree (SDT) and the Word-Syllable search Tree (WST), respectively. Using these two search trees together with a word N-gram language model and the syllable-synchronous network search (SSNS) algorithm, any digit string can be easily converted into a Chinese word sequence or sentence. Without users selecting from candidates, the character error rate (CER) of digit-to-character (D/C) conversion is 6.6% on a test text consisting of 22,083 characters.
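
The digit-to-pinyin step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the keypad layout, so the standard telephone keypad is assumed; the syllable list is a tiny hypothetical inventory; and a flat dictionary stands in for the Syllable-Digit search Tree. The names `to_digits`, `build_digit_index`, and `segmentations` are illustrative only.

```python
from collections import defaultdict

# Assumption: standard telephone keypad layout (not given in the abstract).
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}

# Hypothetical toy syllable inventory; a real system would cover all pinyin.
SYLLABLES = ["ni", "mi", "hao", "zhong", "wen", "shu", "ru"]


def to_digits(pinyin):
    """Encode a pinyin syllable as the digit keys that spell it."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in pinyin)


def build_digit_index(syllables):
    """Map each digit code to every syllable that shares it."""
    index = defaultdict(list)
    for syllable in syllables:
        index[to_digits(syllable)].append(syllable)
    return index


def segmentations(digits, index, max_len=6):
    """Enumerate every split of a digit string into known syllable codes."""
    if not digits:
        return [[]]
    results = []
    for k in range(1, min(max_len, len(digits)) + 1):
        for syllable in index.get(digits[:k], []):
            for rest in segmentations(digits[k:], index, max_len):
                results.append([syllable] + rest)
    return results


if __name__ == "__main__":
    index = build_digit_index(SYLLABLES)
    for candidate in segmentations("64426", index):
        print(" ".join(candidate))
```

In this toy inventory, "ni" and "mi" share the code 64, so the digit string 64426 yields more than one syllable sequence. Ambiguity of this kind is what the word-level mapping and the N-gram language model in the abstract are meant to resolve; the sketch stops before that step.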



Bibliographic reference.  Zheng, Fang / Wu, Jian / Wu, Wenhu (2000): "Input Chinese sentences using digits", In ICSLP-2000, vol.3, 127-130.