Named entity recognition (NER) is usually developed and tested on text
from well-written sources. However, in intelligent voice assistants,
where NER is an important component, input to NER may be noisy because
of user or speech recognition error. In applications, entity labels
may change frequently, and non-textual properties like topicality or
popularity may be needed to choose among alternatives.
We describe a NER system intended to address these problems. We train
and test this system on a proprietary user-derived dataset. We compare
three configurations: a baseline text-only NER system; the baseline
enhanced with external gazetteers; and the baseline enhanced with the
search and indirect labelling techniques we describe below. The final
configuration gives a reduction of around 6% in NER error rate. We also
show that this technique improves related
tasks, such as semantic parsing, with an improvement of up to 5% in
error rate.
Cite as: Muralidharan, D., Moniz, J.R.A., Zhang, W., Pulman, S., Li, L., Barnes, M., Pan, J., Williams, J., Acero, A. (2021) DEXTER: Deep Encoding of External Knowledge for Named Entity Recognition in Virtual Assistants. Proc. Interspeech 2021, 1234-1238, doi: 10.21437/Interspeech.2021-1877
@inproceedings{muralidharan21_interspeech,
  author={Deepak Muralidharan and Joel Ruben Antony Moniz and Weicheng Zhang and Stephen Pulman and Lin Li and Megan Barnes and Jingjing Pan and Jason Williams and Alex Acero},
  title={{DEXTER: Deep Encoding of External Knowledge for Named Entity Recognition in Virtual Assistants}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={1234--1238},
  doi={10.21437/Interspeech.2021-1877}
}