7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Vocabulary Independent OOV Detection Using Support Vector Machines

Tommi Lahti, Janne Suontausta

Nokia Research Center, Finland

This paper proposes a novel out-of-vocabulary (OOV) word detection method that relies on phoneme-level acoustic measures and Support Vector Machines (SVMs). Word-level OOV scores are computed from the phoneme-level in-vocabulary (IV) and OOV information provided by an HMM-based speech recognizer. The OOV decision is made by an SVM classifier operating on the resulting confidence feature vector, and the decision thresholds are independent of the test vocabulary. The proposed SVM classification scheme was compared experimentally with word-level and sub-word-level confidence methods. The tests indicate that SVM-based OOV rejection generalizes best to the test set: while all methods performed similarly after parameter optimization on the training set, the proposed scheme reduced the false acceptance rate on the test set by 30.4% compared with the word-level confidence method with experimentally determined decision thresholds.
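
The abstract does not specify the confidence features, the SVM kernel, or the training procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each phoneme segment carries a hypothetical pair of log-likelihood scores (one from the IV decoding, one from an OOV model), aggregates their differences into a two-dimensional word-level feature vector, and trains a linear SVM by plain batch subgradient descent on the hinge loss. The function names, the feature choice, and the toy scores are all illustrative and are not taken from the paper.

```python
def word_features(phoneme_scores):
    """Aggregate per-phoneme (iv_score, oov_score) pairs into a word-level vector.

    Assumed features (not from the paper): mean and minimum of iv - oov.
    """
    diffs = [iv - oov for iv, oov in phoneme_scores]
    return [sum(diffs) / len(diffs), min(diffs)]


def decision_value(x, w, b):
    """Linear SVM decision function w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, iters=500):
    """Batch subgradient descent on the L2-regularized hinge loss.

    Labels are +1 for IV words and -1 for OOV words.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision_value(xi, w, b) < 1.0:  # margin violation
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def is_oov(phoneme_scores, w, b):
    """Reject the word as OOV when the SVM decision value is negative."""
    return decision_value(word_features(phoneme_scores), w, b) < 0.0


# Toy training data (invented for illustration): lists of (iv, oov) scores.
iv_words = [
    [(-40.0, -43.0), (-38.0, -40.0), (-41.0, -44.0)],
    [(-35.0, -37.0), (-36.0, -39.0)],
    [(-50.0, -52.0), (-48.0, -49.0), (-47.0, -50.0)],
]
oov_words = [
    [(-45.0, -42.0), (-44.0, -43.0)],
    [(-40.0, -38.0), (-42.0, -39.0), (-41.0, -39.0)],
    [(-39.0, -37.0), (-38.0, -36.0)],
]
X = [word_features(p) for p in iv_words + oov_words]
y = [1] * len(iv_words) + [-1] * len(oov_words)
w, b = train_linear_svm(X, y)

# Classify an unseen word whose OOV model scores beat its IV scores.
print(is_oov([(-40.0, -37.0), (-41.0, -38.0)], w, b))
```

Because the features here are computed from score differences within a word rather than from any word identity, the learned decision boundary does not depend on which vocabulary is active, which is the property the abstract emphasizes. A real system would likely use richer features and a tuned kernel.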


Bibliographic reference. Lahti, Tommi / Suontausta, Janne (2002): "Vocabulary independent OOV detection using support vector machines", in ICSLP-2002, 1609-1612.