In this paper, we propose novel methods that use prior mispronunciation knowledge extracted from a large L2 speech corpus to improve segmental mispronunciation detection. Mispronunciation rules are categorized, and the occurrence frequency of each error type is calculated from the phone-level annotation of the corpus. Based on these rules and statistics, we construct extended pronunciation lexicons whose prior probabilities reflect how likely each type of error is to occur, and use them as language models for ASR. A two-pass, confusion-network-based strategy, which uses posterior probability scores with optimal thresholds estimated from the L2 speech corpus, is introduced to refine the phone recognition results. Experimental results show that the proposed methods significantly improve mispronunciation detection performance.
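As a rough illustration of the lexicon-extension idea, the following minimal sketch estimates substitution priors from aligned (canonical, realized) phone pairs and attaches a normalized prior to each pronunciation variant of a word. It is not the authors' implementation: the function names, the single-substitution restriction, the `min_prob` cutoff, and the toy annotations are all assumptions made for this example.

```python
from collections import Counter, defaultdict

def estimate_error_priors(aligned_pairs):
    """Relative frequency of each realized phone given the canonical phone."""
    counts = defaultdict(Counter)
    for canonical, realized in aligned_pairs:
        counts[canonical][realized] += 1
    priors = {}
    for canonical, realized_counts in counts.items():
        total = sum(realized_counts.values())
        priors[canonical] = {r: c / total for r, c in realized_counts.items()}
    return priors

def variant_score(canonical_phones, realized_phones, priors):
    """Product of per-phone probabilities; unseen canonical phones count as always correct."""
    score = 1.0
    for c, r in zip(canonical_phones, realized_phones):
        score *= priors.get(c, {c: 1.0}).get(r, 0.0)
    return score

def extend_lexicon_entry(canonical_phones, priors, min_prob=0.05):
    """Canonical pronunciation plus single-substitution variants, with normalized priors."""
    variants = [list(canonical_phones)]
    for i, c in enumerate(canonical_phones):
        for r, p in priors.get(c, {}).items():
            if r != c and p >= min_prob:  # hypothetical cutoff for rare errors
                variants.append(canonical_phones[:i] + [r] + canonical_phones[i + 1:])
    scored = [(v, variant_score(canonical_phones, v, priors)) for v in variants]
    total = sum(s for _, s in scored)
    if total == 0.0:
        return []
    return [(v, s / total) for v, s in scored]

# Toy annotations (made up for illustration, not taken from the paper's corpus).
toy_annotations = [
    ("r", "r"), ("r", "r"), ("r", "r"), ("r", "l"),
    ("ae", "ae"), ("ae", "ae"), ("ae", "ae"), ("ae", "eh"),
    ("t", "t"), ("t", "t"),
]
priors = estimate_error_priors(toy_annotations)
entry = extend_lexicon_entry(["r", "ae", "t"], priors)
for phones, prior in entry:
    print(" ".join(phones), round(prior, 3))
```

In a real system, the variants and their priors would be written out in whatever lexicon or grammar format the recognizer expects; the sketch stops at the in-memory list.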
Bibliographic reference. Luo, Dean / Yang, Xuesong / Wang, Lan (2011): "Improvement of segmental mispronunciation detection with prior knowledge extracted from large L2 speech corpus", In INTERSPEECH-2011, 1593-1596.