Named entities are of great importance in spoken document processing, but speech recognizers often get them wrong because they are infrequent. This paper proposes a name correction method based on document-level name clustering, consisting of three components: named entity detection, name clustering, and name hypothesis selection. We compare the performance of this method to oracle conditions and show that the oracle gain is a 23% reduction in name character error for Mandarin, and that the automatic approach achieves about 20% of that gain.
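The clustering and selection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure, the clustering algorithm, or the selection criterion, so the character-level similarity from `difflib`, the greedy clustering, the `threshold` value, and the most-frequent-form selection rule below are all assumptions chosen for illustration. Named entity detection is assumed to have already produced the list of name hypotheses for one document.

```python
from collections import Counter
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (assumed stand-in for a real name-similarity measure)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_names(names, threshold=0.6):
    """Greedy clustering: a name joins the first cluster that holds a sufficiently similar member."""
    clusters = []
    for name in names:
        for cluster in clusters:
            if any(similarity(name, member) >= threshold for member in cluster):
                cluster.append(name)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            clusters.append([name])
    return clusters


def select_hypothesis(cluster):
    """Assumed selection rule: keep the form that occurs most often in the cluster."""
    return Counter(cluster).most_common(1)[0][0]


def correct_names(names, threshold=0.6):
    """Map every detected name hypothesis to the form selected for its cluster."""
    corrected = {}
    for cluster in cluster_names(names, threshold):
        best = select_hypothesis(cluster)
        for name in cluster:
            corrected[name] = best
    return corrected


if __name__ == "__main__":
    # Hypothetical name hypotheses detected in one recognized document.
    detected = ["Ostendorf", "Ostendorf", "Ostendorff", "Zhang", "Ostendorf", "Zang"]
    print(correct_names(detected))
```

The idea the sketch tries to capture is that a name misrecognized in one place is often recognized correctly elsewhere in the same document, so grouping similar hypotheses lets the more reliable form replace the erroneous ones.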
Cite as: Zhang, B., Wu, W., Kahn, J.G., Ostendorf, M. (2009) Improving the recognition of names by document-level clustering. Proc. Interspeech 2009, 1035-1038, doi: 10.21437/Interspeech.2009-319
@inproceedings{zhang09b_interspeech,
  author={Bin Zhang and Wei Wu and Jeremy G. Kahn and Mari Ostendorf},
  title={{Improving the recognition of names by document-level clustering}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1035--1038},
  doi={10.21437/Interspeech.2009-319}
}