This paper proposes a novel keyword-spotting scheme for conversational speech that uses frame-level phoneme likelihoods and phoneme duration statistics. Because spontaneous utterances contain many ill-formed sentences, it is very difficult to build a high-performance continuous speech recognition system based on a top-down, syntax-driven process. We therefore propose a bottom-up method that detects keywords in continuous speech with a dynamic programming (DP) technique using both phonemic and durational likelihoods. The algorithm is based on an island-driven DP method in which both ends of the match are free. In a speaker-dependent keyword-spotting test, both the number of erroneous candidates and the processing time decreased to 1/6 of those of the conventional continuous DP method. This result shows the feasibility of our method for continuous speech recognition, especially for conversational-style utterances.
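To illustrate how phonemic and durational likelihoods can be combined in a DP search with free endpoints, the following is a minimal sketch and not the algorithm of the paper. It uses a plain left-to-right segmental DP rather than the island-driven search, and it assumes a Gaussian duration model per phoneme. The names `spot_keyword`, `duration_logpdf`, `dur_stats`, and `max_dur` are hypothetical and chosen only for this example.

```python
import math


def duration_logpdf(d, mean, std):
    """Log of a Gaussian density over duration d in frames (assumed duration model)."""
    return -0.5 * ((d - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def spot_keyword(loglik, keyword, dur_stats, max_dur=30):
    """Return (score, start, end) of the best keyword match; end is exclusive.

    loglik[t][p] : frame-level log-likelihood of phoneme p at frame t
    keyword      : list of phoneme ids that make up the keyword
    dur_stats[p] : (mean, std) of the duration of phoneme p, in frames
    """
    T = len(loglik)
    K = len(keyword)
    NEG = float("-inf")

    # best[k][t]: best score of the first k phonemes ending just before frame t.
    # start[k][t]: frame at which that best partial match begins.
    best = [[NEG] * (T + 1) for _ in range(K + 1)]
    start = [[-1] * (T + 1) for _ in range(K + 1)]

    # Free start point: the keyword may begin at any frame at no cost.
    for t in range(T + 1):
        best[0][t] = 0.0
        start[0][t] = t

    for k in range(1, K + 1):
        p = keyword[k - 1]
        mean, std = dur_stats[p]
        for t in range(1, T + 1):
            acoustic = 0.0
            # Try every duration d for phoneme p occupying frames [t - d, t).
            for d in range(1, min(max_dur, t) + 1):
                acoustic += loglik[t - d][p]
                prev = best[k - 1][t - d]
                if prev == NEG:
                    continue
                score = prev + acoustic + duration_logpdf(d, mean, std)
                if score > best[k][t]:
                    best[k][t] = score
                    start[k][t] = start[k - 1][t - d]

    # Free end point: take the best-scoring end frame over the whole utterance.
    end = max(range(T + 1), key=lambda t: best[K][t])
    return best[K][end], start[K][end], end
```

In this sketch the score is the unnormalised sum of log-likelihoods, so a practical detector would still need a length normalisation or a threshold to decide whether a candidate is accepted; those choices are left open here.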
Bibliographic reference. Okawa, Shigeki / Kobayashi, Tetsunori / Shirai, Katsuhiko (1993): "Word spotting in conversational speech based on phonemic unit likelihood by mutual information criterion", In EUROSPEECH'93, 1281-1284.