This paper describes a Japanese dictation system that can effectively handle an unlimited vocabulary. The proposed approach automatically generates a character n-gram syntax from both the original training text and newly generated text. Each sentence phrase in the newly generated text that does not appear in the training text is created using the training text and the character trigram model of that text. The newly generated text covers about one-third of the search space not covered by the training text, which shows the effectiveness of the text auto-generation approach. Furthermore, compared with the common beam-search technique, the proposed search technique reduces processing time by about three-fourths and achieves more accurate recognition.
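The text auto-generation idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple count-based character trigram model with no smoothing, random sampling as the generation procedure, and toy Latin strings standing in for Japanese phrases. The function names (`train_char_trigrams`, `generate_phrase`, `generate_new_phrases`) and the `^` and `$` boundary markers are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def train_char_trigrams(phrases):
    """Count character trigrams: two-character context -> next character.

    Assumption: '^' pads the start and '$' marks the end of each phrase,
    and neither symbol occurs inside the training phrases.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for phrase in phrases:
        padded = "^^" + phrase + "$"
        for i in range(len(padded) - 2):
            counts[padded[i:i + 2]][padded[i + 2]] += 1
    return counts


def generate_phrase(counts, rng, max_len=20):
    """Sample one phrase character by character from the trigram counts."""
    context, out = "^^", []
    while len(out) < max_len:
        successors = counts[context]
        chars = list(successors)
        weights = [successors[c] for c in chars]
        nxt = rng.choices(chars, weights=weights)[0]
        if nxt == "$":  # end-of-phrase marker was sampled
            break
        out.append(nxt)
        context = context[1] + nxt
    return "".join(out)


def generate_new_phrases(phrases, n_samples=200, seed=0):
    """Keep only sampled phrases that do not occur in the training text."""
    counts = train_char_trigrams(phrases)
    rng = random.Random(seed)
    seen = set(phrases)
    new = set()
    for _ in range(n_samples):
        candidate = generate_phrase(counts, rng)
        if candidate and candidate not in seen:
            new.add(candidate)
    return new


if __name__ == "__main__":
    training = ["abcd", "xbce"]
    # Phrases licensed by the trigram model but absent from the training text.
    print(sorted(generate_new_phrases(training)))
```

In this toy setting the two training phrases share the context `bc`, so the trigram model also licenses recombined phrases that never appeared in the training text. An n-gram syntax built from both the original and the generated phrases would then cover a larger part of the search space, which is the effect the abstract reports.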
Keywords: Speech Recognition, Language Modeling, Searching
Bibliographic reference. Matsunaga, Shoichi / Yamada, Tomokazu / Shikano, Kiyohiro (1993): "Dictation system using inductively auto-generated syntax", In EUROSPEECH'93, 2135-2138.