8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Techniques for Effective Vocabulary Selection

Anand Venkataraman, Wen Wang

SRI International, USA

The vocabulary of a continuous speech recognition (CSR) system is a significant factor in determining its performance. In this paper, we present three principled approaches for selecting the target vocabulary for a particular domain by trading off the target out-of-vocabulary (OOV) rate against vocabulary size. We evaluate these approaches against an ad hoc baseline strategy. Results are presented as graphs of OOV rate plotted against increasing vocabulary size for each technique.
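To make the quantity being traded off concrete, the following is a minimal sketch of how an OOV-rate curve could be computed. It is not one of the three approaches from the paper, which the abstract does not describe. It assumes a simple frequency-ranked vocabulary built from training text, and the function name `oov_curve`, its parameters, and the toy data are all hypothetical.

```python
from collections import Counter


def oov_curve(train_tokens, heldout_tokens, sizes):
    """Return (vocabulary size, OOV rate) pairs.

    Assumption: the vocabulary of size N is simply the N most frequent
    training words. This is an illustrative strategy only.
    """
    if not heldout_tokens:
        raise ValueError("heldout_tokens must not be empty")
    counts = Counter(train_tokens)
    # Rank by descending count; break ties alphabetically so the result
    # is deterministic.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    curve = []
    for size in sizes:
        vocab = set(ranked[:size])
        # OOV rate: fraction of held-out tokens missing from the vocabulary.
        misses = sum(1 for tok in heldout_tokens if tok not in vocab)
        curve.append((size, misses / len(heldout_tokens)))
    return curve


# Toy data, invented for illustration.
train = "the cat sat on the mat the cat ran".split()
heldout = "the dog sat on the mat".split()
curve = oov_curve(train, heldout, sizes=[1, 2, 4, 6])
for size, rate in curve:
    print(f"vocab size {size}: OOV rate {rate:.3f}")
```

Plotting the second element of each pair against the first gives a curve of the kind the abstract refers to. The rate never increases as the vocabulary grows, because each larger vocabulary contains the smaller ones.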

Bibliographic reference. Venkataraman, Anand / Wang, Wen (2003): "Techniques for effective vocabulary selection", in EUROSPEECH-2003, 245-248.