14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Acceleration of Spoken Term Detection Using a Suffix Array by Assigning Optimal Threshold Values to Sub-Keywords

Kouichi Katsurada (1), Seiichi Miura (1), Kheang Seng (1), Yurie Iribe (2), Tsuneo Nitta (1)

(1) Toyohashi University of Technology, Japan
(2) Aichi Prefectural University, Japan

We previously proposed a fast spoken term detection method that uses a suffix array data structure for searching large-scale speech documents. The method reduces search time via techniques such as keyword division and iterative lengthening search. In this paper, we propose a statistical method of assigning different threshold values to sub-keywords to further accelerate search. Specifically, the method estimates the numbers of results for keyword searches and then reduces them by adjusting the threshold values assigned to sub-keywords. We also investigate the theoretical condition that must be satisfied by these threshold values. Experiments show that the proposed search method is 10% to 30% faster than previous methods.
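
The idea of giving each sub-keyword its own threshold can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not state the exact condition they derive. The sketch makes three assumptions: Hamming distance stands in for the paper's distance measure, a linear scan stands in for the suffix array lookup, and the safety condition is the standard pigeonhole argument for integer distances (if every sub-keyword exceeded its threshold t_i, the total distance would be at least sum(t_i) + k, so requiring sum(t_i) + k > T guarantees that no match within T is missed). All function names are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def split_keyword(keyword, k):
    """Divide the keyword into k contiguous sub-keywords of near-equal length."""
    base, extra = divmod(len(keyword), k)
    parts, start = [], 0
    for i in range(k):
        end = start + base + (1 if i < extra else 0)
        parts.append((start, keyword[start:end]))
        start = end
    return parts


def thresholds_are_safe(thresholds, total):
    """Pigeonhole check (assumed condition, integer distances only)."""
    return sum(thresholds) + len(thresholds) > total


def search(text, keyword, total, thresholds):
    """Return (hits, candidate_count) for matches within `total` mismatches."""
    if not thresholds_are_safe(thresholds, total):
        raise ValueError("threshold assignment could miss valid matches")
    candidates = set()
    parts = split_keyword(keyword, len(thresholds))
    for (offset, part), t in zip(parts, thresholds):
        # Linear scan: a stand-in for the suffix array search of this part.
        for pos in range(len(text) - len(part) + 1):
            if hamming(text[pos:pos + len(part)], part) <= t:
                start = pos - offset
                if 0 <= start <= len(text) - len(keyword):
                    candidates.add(start)
    # Verify each candidate against the whole keyword.
    hits = sorted(s for s in candidates
                  if hamming(text[s:s + len(keyword)], keyword) <= total)
    return hits, len(candidates)


if __name__ == "__main__":
    text = "abcdefabxdefabcxxf"
    print(search(text, "abcdef", total=1, thresholds=(0, 0)))
```

Any safe assignment returns the same hits; what changes is how many candidates must be verified, which is the quantity the proposed method aims to reduce by estimating result counts before choosing the thresholds.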

Bibliographic reference. Katsurada, Kouichi / Miura, Seiichi / Seng, Kheang / Iribe, Yurie / Nitta, Tsuneo (2013): "Acceleration of spoken term detection using a suffix array by assigning optimal threshold values to sub-keywords", In INTERSPEECH-2013, 11-14.