EUROSPEECH 2003 - INTERSPEECH 2003
The vocabulary of a continuous speech recognition (CSR) system is a significant factor in determining its performance. In this paper, we present three principled approaches for selecting the target vocabulary for a particular domain by trading off the target out-of-vocabulary (OOV) rate against vocabulary size. We evaluate these approaches against an ad-hoc baseline strategy. Results are presented as graphs of OOV rate plotted against increasing vocabulary size for each technique.
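To make the evaluation metric concrete, the following is a minimal sketch of how an OOV-rate-versus-vocabulary-size curve might be computed. It assumes a simple frequency-ranked vocabulary built from a training text, which is only an illustrative stand-in and is not claimed to be the paper's baseline or any of its three approaches. The function names `oov_rate` and `oov_curve` and the toy token lists are hypothetical.

```python
from collections import Counter


def oov_rate(vocab, test_tokens):
    """Return the fraction of test tokens that are not in vocab."""
    if not test_tokens:
        return 0.0
    misses = sum(1 for tok in test_tokens if tok not in vocab)
    return misses / len(test_tokens)


def oov_curve(train_tokens, test_tokens, sizes):
    """Return (size, OOV rate) pairs for top-k frequency-ranked vocabularies.

    Assumption: words are ranked by training-set count, with ties broken
    alphabetically so the result is deterministic.
    """
    counts = Counter(train_tokens)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return [(k, oov_rate(set(ranked[:k]), test_tokens)) for k in sizes]


# Toy data, purely illustrative.
train = "the cat sat on the mat the dog sat".split()
test = "the dog sat on a log".split()

curve = oov_curve(train, test, [1, 2, 4])
for size, rate in curve:
    print(f"vocabulary size {size}: OOV rate {rate:.3f}")
```

Plotting the resulting pairs with vocabulary size on the horizontal axis gives a curve of the kind the abstract describes. With a nested, frequency-ranked vocabulary like this one, the OOV rate cannot increase as the vocabulary grows.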
Bibliographic reference. Venkataraman, Anand / Wang, Wen (2003): "Techniques for effective vocabulary selection", In EUROSPEECH-2003, 245-248.