This paper presents a novel two-step spoken term detection (STD) method that runs the same STD engine twice and uses a support vector machine (SVM)-based classifier to verify the terms detected in the second pass. In the first STD process, the target spoken documents are pre-indexed using a keyword list built from their automatic speech recognition (ASR) results. The result of the first STD process is a set of keywords and their detection intervals (positions) in the spoken documents. Keywords with competing intervals are ranked by STD matching cost, and the best one with the longest duration among the competing detections is selected. The selected keywords are registered in the pre-index. In the second STD process, the same STD engine searches for a query, and the candidates it outputs are then verified by the SVM classifier. The proposed two-step STD method was evaluated on the NTCIR-10 SpokenDoc-2 STD task, where it substantially outperformed the traditional STD method based on dynamic time warping and the confusion network-based index.
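The selection among competing detections in the first STD process can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define "competitive" intervals or the exact ranking rule, so the sketch assumes that two detections compete when their time intervals overlap, that a lower matching cost is better, and that ties in cost are broken in favor of the longer detection. The names `Detection`, `overlaps`, and `select_for_pre_index`, and all keyword and cost values, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    keyword: str
    start: float  # interval start in seconds
    end: float    # interval end in seconds
    cost: float   # STD matching cost (assumption: lower is better)

    @property
    def duration(self) -> float:
        return self.end - self.start


def overlaps(a: Detection, b: Detection) -> bool:
    # Assumption: two detections compete when their intervals overlap in time.
    return a.start < b.end and b.start < a.end


def select_for_pre_index(detections: List[Detection]) -> List[Detection]:
    # Rank by matching cost (ascending), breaking ties by duration (descending).
    ranked = sorted(detections, key=lambda d: (d.cost, -d.duration))
    kept: List[Detection] = []
    for candidate in ranked:
        # Keep a candidate only if no better-ranked detection already
        # occupies an overlapping interval.
        if not any(overlaps(candidate, k) for k in kept):
            kept.append(candidate)
    # Return the survivors in time order, as they would appear in the index.
    return sorted(kept, key=lambda d: d.start)


# Illustrative (made-up) detections from a first STD pass.
detections = [
    Detection("tokyo", 1.0, 1.6, 0.20),
    Detection("kyoto", 1.2, 1.5, 0.20),   # same cost as "tokyo" but shorter
    Detection("speech", 3.0, 3.5, 0.35),
    Detection("peach", 3.1, 3.4, 0.10),   # lower cost than "speech"
]

selected = select_for_pre_index(detections)
print([d.keyword for d in selected])  # ['tokyo', 'peach']
```

A greedy pass over the ranked list is used here only because it is the simplest way to keep one detection per group of competing intervals; the paper may resolve competition differently.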
Bibliographic reference. Domoto, Kentaro / Utsuro, Takehito / Sawada, Naoki / Nishizaki, Hiromitsu (2015): "Two-step spoken term detection using SVM classifier trained with pre-indexed keywords based on ASR result", In INTERSPEECH-2015, 834-838.