In this paper, we propose a technique for quickly detecting keywords in a very large speech database without requiring a large amount of memory. To accelerate searches and save memory, we use a suffix array as the data structure and apply phoneme-based DP matching. To avoid an exponential increase in processing time with keyword length, a long keyword is divided into short sub-keywords. Moreover, an iterative lengthening search algorithm is used to rapidly output accurate search results. Experimental results show that the first set of search results is returned in less than 100 ms from a 10,000-h virtual speech database.
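The two structural ideas in the abstract, indexing a phoneme sequence with a suffix array and splitting a long keyword into short sub-keywords, can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses exact matching in place of phoneme-based DP matching, omits the iterative lengthening search, and builds the suffix array naively. The function names (build_suffix_array, find_exact, search), the sub-keyword length, and the toy phoneme sequence are all assumptions made for illustration.

```python
def build_suffix_array(phonemes):
    """Return suffix start positions sorted by the suffix they begin.

    Naive construction (assumption: toy-sized input); a real system
    would need a far more efficient builder.
    """
    return sorted(range(len(phonemes)), key=lambda i: phonemes[i:])


def find_exact(phonemes, sa, pattern):
    """Return sorted start positions where `pattern` occurs exactly.

    Two binary searches over the suffix array locate the contiguous
    block of suffixes that begin with `pattern`.
    """
    pattern = list(pattern)
    m = len(pattern)

    def prefix(k):
        i = sa[k]
        return phonemes[i:i + m]

    # Lower bound: first suffix whose m-phoneme prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-phoneme prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def search(phonemes, sa, keyword, size=3):
    """Find `keyword` by looking up fixed-size sub-keywords separately.

    Each sub-keyword hit is shifted back by its offset in the keyword;
    only start positions confirmed by every sub-keyword are kept.
    (Exact matching only: a simplification of the DP matching
    described in the abstract.)
    """
    keyword = list(keyword)
    candidates = None
    for off in range(0, len(keyword), size):
        sub = keyword[off:off + size]
        starts = {p - off for p in find_exact(phonemes, sa, sub)}
        candidates = starts if candidates is None else candidates & starts
    return sorted(candidates or [])


# Toy phoneme sequence (hypothetical data, one symbol per phoneme).
phonemes = "k a t a k a n a k a t a".split()
sa = build_suffix_array(phonemes)
hits = search(phonemes, sa, "k a t a".split(), size=2)
```

In this sketch each sub-keyword lookup costs a binary search over the suffix array, so the work grows with the number of sub-keywords rather than with the full keyword length. Tolerating recognition errors, as the DP matching in the abstract does, would require replacing find_exact with an approximate search.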
Bibliographic reference. Katsurada, Kouichi / Teshima, Shigeki / Nitta, Tsuneo (2009): "Fast keyword detection using suffix array", In INTERSPEECH-2009, 2147-2150.