We propose a novel method for lexicon expansion that uses pronunciation variations extracted from speaker-related deviations in automatic speech recognition (ASR) error statistics. Two types of pronunciation variation were extracted: common pronunciation variations, found with most speakers, and speaker-related pronunciation variations, identified on the basis of recognition error elements weighted by idf and tf-idf measures. Experimental results on the Corpus of Spontaneous Japanese (CSJ) show that lexicon entries derived from speaker-related pronunciation variations were more effective than those derived from common pronunciation variations, some of which were superfluous.
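As a rough illustration of the idf and tf-idf weighting mentioned above, the following minimal sketch treats each speaker as a "document" and each recognition error element as a "term". This is only one plausible reading of the abstract, not the paper's actual procedure: the function name `tfidf_by_speaker`, the representation of an error element as a (reference, recognized) pair, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_by_speaker(errors_by_speaker):
    """Weight error elements per speaker (hypothetical helper, not from the paper).

    errors_by_speaker maps a speaker ID to a list of hashable error elements.
    Returns (idf, tfidf), where idf maps element -> idf weight and
    tfidf maps speaker -> {element: tf-idf weight}.
    """
    n_speakers = len(errors_by_speaker)
    counts = {spk: Counter(errs) for spk, errs in errors_by_speaker.items()}

    # Document frequency: number of speakers whose errors include the element.
    df = Counter()
    for c in counts.values():
        df.update(c.keys())

    # An element seen with every speaker gets idf = log(1) = 0.
    idf = {elem: math.log(n_speakers / d) for elem, d in df.items()}

    tfidf = {}
    for spk, c in counts.items():
        total = sum(c.values())
        # Term frequency is the element's share of that speaker's errors.
        tfidf[spk] = {elem: (n / total) * idf[elem] for elem, n in c.items()}
    return idf, tfidf


# Toy data, invented for illustration: (reference, recognized) pairs.
errors_by_speaker = {
    "spk1": [("ei", "ee"), ("ei", "ee"), ("su", "s")],
    "spk2": [("ei", "ee"), ("desu", "des")],
    "spk3": [("ei", "ee"), ("su", "s"), ("su", "s")],
}

idf, tfidf = tfidf_by_speaker(errors_by_speaker)

# Low idf suggests a common variation; high tf-idf suggests a speaker-related one.
common = [elem for elem, w in idf.items() if w == 0.0]
top_for_spk2 = max(tfidf["spk2"], key=tfidf["spk2"].get)
```

Under these assumptions, an error element observed with every speaker receives an idf of zero and would count as a common variation, while an element concentrated in a single speaker's errors receives a high tf-idf weight for that speaker.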
Bibliographic reference. Onishi, Yoshifumi (2008): "Lexicon expansion using pronunciation variations extracted on the basis of speaker-related deviation in recognition error statistics", In INTERSPEECH-2008, 1809-1812.