ISCA Archive Interspeech 2017

Character-Based Embedding Models and Reranking Strategies for Understanding Natural Language Meal Descriptions

Mandy Korpusik, Zachary Collins, James Glass

Character-based embedding models provide robustness to misspellings and typos in natural language. In this paper, we explore embedding models based on convolutional neural networks for handling out-of-vocabulary words in a meal description food ranking task. We demonstrate that character-based models combined with a standard word-based model improve the top-5 recall of USDA database food items from 26.3% to 30.3% on a test set of all USDA foods with typos simulated in 10% of the data. We also propose a new reranking strategy for predicting the top USDA food matches given a meal description, which significantly outperforms our prior method of n-best decoding with a finite state transducer, improving the top-5 recall on the all USDA foods task from 20.7% to 63.8%.
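
For readers unfamiliar with the metric, top-5 recall can be read as the fraction of test examples whose correct USDA food appears among the five highest-ranked candidates. The sketch below illustrates that reading only; the function name `top_k_recall`, its inputs, and the one-gold-item-per-example setup are assumptions made for illustration, and the paper's exact evaluation protocol may differ.

```python
from typing import Sequence


def top_k_recall(
    ranked: Sequence[Sequence[str]],
    gold: Sequence[str],
    k: int = 5,
) -> float:
    """Fraction of examples whose gold item is among the first k candidates.

    Assumption (not from the paper): each example has exactly one gold item,
    and ranked[i] lists candidate food IDs for example i, best first.
    """
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for cands, g in zip(ranked, gold) if g in cands[:k])
    return hits / len(gold)


# Hypothetical toy data: two examples with made-up food IDs.
ranked = [["a", "b"], ["c", "d", "e", "f", "g", "h"]]
gold = ["b", "h"]
recall_at_5 = top_k_recall(ranked, gold, k=5)
recall_at_6 = top_k_recall(ranked, gold, k=6)
```

In this toy data the second example's gold item sits at rank 6, so it counts as a miss at k = 5 and as a hit at k = 6.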


doi: 10.21437/Interspeech.2017-422

Cite as: Korpusik, M., Collins, Z., Glass, J. (2017) Character-Based Embedding Models and Reranking Strategies for Understanding Natural Language Meal Descriptions. Proc. Interspeech 2017, 3320-3324, doi: 10.21437/Interspeech.2017-422

@inproceedings{korpusik17_interspeech,
  author={Mandy Korpusik and Zachary Collins and James Glass},
  title={{Character-Based Embedding Models and Reranking Strategies for Understanding Natural Language Meal Descriptions}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={3320--3324},
  doi={10.21437/Interspeech.2017-422}
}