In this study, because phrase grouping affects spontaneous speech, prosodic words rather than lexical words are adopted as the units for correcting errors in speech recognition results. Prosodic words and their corresponding misrecognized word fragments are extracted from a speech database to construct a misrecognized word fragment table. For each word fragment in a recognized word sequence, the prosodic words that are likely to have been misrecognized as that fragment are retrieved from the table for prosodic word candidate expansion. Prosodic word-based contextual information, in the form of substitution and concatenation scores, is then incorporated into a probabilistic model to find the best word fragment sequence as the corrected output. Experimental results show that the proposed method achieves an F1 score of 0.32, an improvement of 0.18 and 0.10 over the SMT-based and lexical word-based approaches, respectively.
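The pipeline summarized in the abstract (table lookup, candidate expansion, then a search for the best-scoring sequence) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy table and scores, the `keep_prob` and `floor` parameters, the one-candidate-per-fragment alignment, and the use of a Viterbi-style search over log-probabilities are all assumptions made for illustration.

```python
import math

def expand_candidates(fragments, table, keep_prob):
    """Look up each recognized fragment in the (hypothetical) misrecognized
    word fragment table; the fragment itself always stays a candidate."""
    lattice = []
    for frag in fragments:
        cands = dict(table.get(frag, {}))   # candidate -> substitution score
        cands.setdefault(frag, keep_prob)   # assumed score for "no correction"
        lattice.append(cands)
    return lattice

def correct(fragments, table, concat, keep_prob=0.3, floor=1e-6):
    """Viterbi-style search: combine substitution scores (per candidate) with
    concatenation scores (per adjacent pair) and return the best sequence."""
    if not fragments:
        return []
    lattice = expand_candidates(fragments, table, keep_prob)
    # best[i][w] = (log score of best path ending in w, previous candidate)
    best = [{w: (math.log(p), None) for w, p in lattice[0].items()}]
    for i in range(1, len(lattice)):
        column = {}
        for w, p in lattice[i].items():
            prev, score = max(
                ((pw, ps + math.log(concat.get((pw, w), floor)))
                 for pw, (ps, _) in best[i - 1].items()),
                key=lambda item: item[1],
            )
            column[w] = (score + math.log(p), prev)
        best.append(column)
    # Backtrace from the best final candidate.
    word = max(best[-1], key=lambda k: best[-1][k][0])
    output = [word]
    for i in range(len(best) - 1, 0, -1):
        word = best[i][word][1]
        output.append(word)
    return output[::-1]

# Toy data, invented for illustration only.
TABLE = {"peach": {"speech": 0.6}}
CONCAT = {("recognize", "speech"): 0.5, ("recognize", "peach"): 0.01}

print(correct(["recognize", "peach"], TABLE, CONCAT))
```

In this toy setup the candidate "speech" wins over the recognized fragment "peach" because both its substitution score and its concatenation score with "recognize" are higher. A real system would estimate both score types from the speech database and would also need to handle prosodic words that span several recognized fragments, which this sketch does not attempt.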
Bibliographic reference. Liu, Chao-Hong / Wu, Chung-Hsien (2010): "Prosodic word-based error correction in speech recognition using prosodic word expansion and contextual information", In INTERSPEECH-2010, 1385-1388.