September 22-25, 1997
This paper describes keyword spotting using prosodic information as well as phonemic information. A Japanese word has its own F0 contour based on the lexical accent type and the F0 contour is preserved in sentences. Prosodic dissimilarity between a keyword and input speech is measured by DP matching of F0 contours. Phonemic score is calculated by a conventional HMM technique. A total score based on these two measures is used for detecting keywords. The F0 contour of the keyword is smoothed by using an F0 model. Evaluation test was carried out on recorded speech of a TV news program. The introduction of prosodic information reduces false alarms by 30% or 50% for wide ranges of the detection rate.
Bibliographic reference. Yamashita, Yoichi / Mizoguchi, Riichiro (1997): "Keyword spotting using F0 contour matching", In EUROSPEECH-1997, 271-274.