7th International Conference on Spoken Language Processing
September 16-20, 2002
This paper proposes a new lexicon optimization method to improve the recognition rate of large-scale spontaneous speech recognition. The occurrence count and length of a word correlate strongly with the difficulty of recognizing that word. We first investigate this relationship and build a word correctness probability model. The proposed method then optimizes the lexicon by forming compound words or phrases step by step, guided by the word correctness probability model, so as to improve the estimated recognition rate of the system. The optimization method is applied to a large-scale Japanese spontaneous speech corpus. Experimental results show that a language model using the optimized lexicon improves the recognition rate.
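The step-by-step compounding described above can be illustrated with a minimal sketch. It is not the authors' implementation: the logistic form of `p_correct`, its coefficients, the use of character count as word length, and the exhaustive greedy search over adjacent pairs are all assumptions made for illustration only. Each lexicon unit is a tuple of base words, and the estimated recognition rate weights each unit's correctness probability by the number of base words it covers.

```python
import math
from collections import Counter


def p_correct(count, length):
    """Hypothetical word correctness model (assumption, not from the paper):
    logistic in log occurrence count and unit length, illustrative coefficients."""
    z = -1.0 + 0.5 * math.log(count) + 0.4 * length
    return 1.0 / (1.0 + math.exp(-z))


def estimated_rate(sentences):
    """Estimated fraction of base words recognized correctly.
    `sentences` is a list of lists of units; a unit is a tuple of base words."""
    counts = Counter(u for s in sentences for u in s)
    total = sum(len(u) * c for u, c in counts.items())
    if total == 0:
        return 0.0
    correct = sum(
        len(u) * c * p_correct(c, sum(len(w) for w in u))  # length = characters (assumption)
        for u, c in counts.items()
    )
    return correct / total


def merge_pair(sentences, pair):
    """Replace every adjacent occurrence of `pair` with one compound unit."""
    a, b = pair
    out = []
    for s in sentences:
        merged, i = [], 0
        while i < len(s):
            if i + 1 < len(s) and s[i] == a and s[i + 1] == b:
                merged.append(a + b)  # tuple concatenation forms the compound
                i += 2
            else:
                merged.append(s[i])
                i += 1
        out.append(merged)
    return out


def optimize_lexicon(sentences, max_steps=10):
    """Greedily add the compound that most improves the estimated rate;
    stop when no candidate improves it or after `max_steps` merges."""
    current = [[(w,) for w in s] for s in sentences]
    best = estimated_rate(current)
    for _ in range(max_steps):
        pairs = {(s[i], s[i + 1]) for s in current for i in range(len(s) - 1)}
        candidates = [merge_pair(current, p) for p in sorted(pairs)]
        if not candidates:
            break
        cand = max(candidates, key=estimated_rate)
        rate = estimated_rate(cand)
        if rate <= best:
            break
        current, best = cand, rate
    return current, best
```

In this sketch a merge lengthens the unit (raising its assumed correctness probability) but lowers the counts of the units involved (lowering theirs), so a compound is accepted only when the net effect on the estimated rate is positive.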
Bibliographic reference. Shinozaki, Takahiro / Furui, Sadaoki (2002): "A new lexicon optimization method for LVCSR based on linguistic and acoustic characteristics of words", In ICSLP-2002, 717-720.