Keyword spotting is an efficient approach for finding relevant recordings in databases of recorded unconstrained speech. Many algorithms have been proposed for this problem, and several techniques are claimed to be very efficient and accurate. Researchers have so far attempted to compare their results fairly by using standardized Receiver Operating Characteristic (ROC) curves and by performing experiments on publicly available databases with known keywords.
However, when it comes to comparing the expected behavior of a technique on new keywords and utterances, it is not clear how well published comparisons generalize, and the choice of benchmark keywords has a considerable effect on the comparison. In this paper we propose a new measure of the accuracy of a keyword spotter that removes the benchmark-keyword selection bias and offers a qualitative estimate of how well the technique is expected to perform on new keywords.
We apply our evaluation scheme to compare previously known algorithms as well as a new technique that we propose here. The new technique is based on a confidence measure that scores a keyword match by the worst of its phoneme scores (where the score of a phoneme is taken as the ratio between the log probability of that phoneme and the length of the phoneme). Remarkably, the newly proposed technique can detect all occurrences of 100 keywords with fewer than 0.5 false alarms/keyword/hour.
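The confidence measure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the segment representation (a log probability paired with a length in frames), the function names, the use of a simple threshold for accepting a match, and the sample numbers are all assumptions made for illustration. Only the rule itself, taking the worst length-normalized phoneme log probability as the score of the match, comes from the abstract.

```python
from typing import List, Tuple

# Hypothetical representation of one aligned phoneme of a candidate match:
# (log probability of the phoneme, length of the phoneme in frames).
PhonemeSegment = Tuple[float, int]


def phoneme_score(log_prob: float, length: int) -> float:
    """Score of one phoneme: its log probability divided by its length."""
    if length <= 0:
        raise ValueError("phoneme length must be positive")
    return log_prob / length


def keyword_confidence(segments: List[PhonemeSegment]) -> float:
    """Confidence of a keyword match: the worst (lowest) phoneme score."""
    if not segments:
        raise ValueError("a keyword match needs at least one phoneme")
    return min(phoneme_score(lp, n) for lp, n in segments)


def is_detection(segments: List[PhonemeSegment], threshold: float) -> bool:
    """Accept the match if its confidence reaches the threshold.

    Thresholding is an assumption of this sketch; the abstract does not
    say how the confidence is turned into a decision.
    """
    return keyword_confidence(segments) >= threshold


# Made-up example: three phonemes with per-frame scores -2.0, -6.0, -2.0.
match = [(-12.0, 6), (-30.0, 5), (-8.0, 4)]
confidence = keyword_confidence(match)
```

In this example the middle phoneme has by far the lowest per-frame score, so it alone determines the confidence of the match. A single badly matched phoneme is therefore enough to reject a candidate, whereas an average over phonemes would let the well-matched ones mask it.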
Cite as: Silaghi, M.C., Vargiya, R. (2005) A new evaluation criteria for keyword spotting techniques and a new algorithm. Proc. Interspeech 2005, 1593-1596, doi: 10.21437/Interspeech.2005-465
@inproceedings{silaghi05_interspeech,
  author={Marius C. Silaghi and Rachna Vargiya},
  title={{A new evaluation criteria for keyword spotting techniques and a new algorithm}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={1593--1596},
  doi={10.21437/Interspeech.2005-465}
}