In this paper, we propose an error correction method using text corpora. In this method, recognition errors are corrected using phonetically similar examples in the text corpora. The reliability of each correction hypothesis is judged according to its semantic consistency and its phonetic similarity to the original input. We previously proposed an error correction method that uses a treebank [1]. However, the previous method was not flexible in its use of examples, because recognition errors caused structural mismatches between the input and the examples. In our new proposal, examples are treated as morpheme sequences. This enables us to use parts of examples when no useful full-sentence examples are available. We built our proposed method into a speech translation system and compared the translation quality of simple translation with that of translation with error correction. The rate of acceptable translations increased by about 10% with our proposed method compared to simple translation.
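The idea of using examples partially, as morpheme sequences, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each morpheme is given as a romanized phoneme string, it substitutes plain Levenshtein distance for the paper's phonetic similarity measure, and it leaves out the semantic consistency check entirely. The names `edit_distance`, `best_partial_example`, and `max_distance`, as well as the toy corpus, are hypothetical.

```python
from typing import List, Optional, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here as a stand-in for phonetic similarity."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_partial_example(
    recognized: List[str],
    corpus: List[List[str]],
    max_distance: int,
) -> Optional[Tuple[List[str], int]]:
    """Return the closest contiguous morpheme span in the corpus, or None.

    Every contiguous span of every example is a candidate, so a useful
    fragment can be found even when no full sentence matches.
    """
    target = "".join(recognized)
    best: Optional[Tuple[List[str], int]] = None
    for example in corpus:
        for start in range(len(example)):
            for end in range(start + 1, len(example) + 1):
                span = example[start:end]
                distance = edit_distance(target, "".join(span))
                if distance <= max_distance and (best is None or distance < best[1]):
                    best = (span, distance)
    return best


# Hypothetical toy data: "yoyagu" stands for a misrecognized "yoyaku".
corpus = [
    ["heya", "no", "yoyaku", "o", "onegaishimasu"],
    ["kyou", "wa", "ii", "tenki", "desu"],
]
recognized = ["heya", "no", "yoyagu"]
correction = best_partial_example(recognized, corpus, max_distance=2)
```

The exhaustive span search is quadratic in the example length, which is acceptable for a toy corpus; a real system would need indexing and a phoneme-aware distance in place of character-level edits.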
Cite as: Ishikawa, K., Sumita, E. (1999) Error correction translation using text corpora. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 1995-1998, doi: 10.21437/Eurospeech.1999-440
@inproceedings{ishikawa99_eurospeech,
  author={Kai Ishikawa and Eiichiro Sumita},
  title={{Error correction translation using text corpora}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={1995--1998},
  doi={10.21437/Eurospeech.1999-440}
}