Named entities are important in spoken document processing, but speech recognizers often misrecognize them because they occur infrequently. This paper proposes a name correction method based on document-level name clustering, consisting of three components: named entity detection, name clustering, and name hypothesis selection. We compare the performance of this method to oracle conditions and show that the oracle gain is a 23% reduction in name character error for Mandarin, of which the automatic approach achieves about 20%.
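As a rough illustration of the clustering and hypothesis selection components, the sketch below groups the name hypotheses detected in one document by string similarity and replaces each with the most frequent form in its group. It is not the authors' implementation: the greedy clustering, the `difflib` similarity measure, the 0.5 threshold, the majority-vote selection rule, and all function names are assumptions made for illustration, and the named entity detection step is assumed to have already produced the input list.

```python
from collections import Counter
from difflib import SequenceMatcher


def similar(a, b, threshold=0.5):
    """Assumed similarity test: character-level match ratio from difflib."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def cluster_names(names, threshold=0.5):
    """Greedy clustering (an assumption): each name joins the first cluster
    whose first member is similar to it, otherwise it starts a new cluster."""
    clusters = []
    for name in names:
        for cluster in clusters:
            if similar(name, cluster[0], threshold):
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def select_hypothesis(cluster):
    """Assumed selection rule: the most frequent surface form in the cluster."""
    return Counter(cluster).most_common(1)[0][0]


def correct_names(names, threshold=0.5):
    """Replace every detected name with the selected form of its cluster."""
    mapping = {}
    for cluster in cluster_names(names, threshold):
        best = select_hypothesis(cluster)
        for name in cluster:
            mapping[name] = best
    return [mapping[name] for name in names]


# Toy recognizer output for a single document (hypothetical data).
detected = ["Ostendorf", "Ostendorf", "Ostendorff", "Kahn", "Ostendor"]
corrected = correct_names(detected)
```

In this toy input, the three spellings of the first name fall into one cluster and are all mapped to its most frequent form, while the unrelated name keeps its own cluster. A real system would more plausibly compare recognizer hypotheses using acoustic or phonetic evidence rather than plain string overlap.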
Bibliographic reference. Zhang, Bin / Wu, Wei / Kahn, Jeremy G. / Ostendorf, Mari (2009): "Improving the recognition of names by document-level clustering", In INTERSPEECH-2009, 1035-1038.