Third International Conference on Spoken Language Processing (ICSLP 94)
In conventional systems for word spotting or speech recognition, individual frames or segments of the input speech are assigned labels and local likelihood scores solely on the basis of their own acoustic characteristics. Human perception of speech segments, on the other hand, depends heavily on their acoustic and linguistic context. This paper presents a new method of word spotting in continuous speech based on template matching, in which the likelihood score of each segment of a word is determined not only by its own characteristics but also by the likelihood of its context within the framework of a word. The advantage of the proposed method over conventional methods is demonstrated by a word-spotting experiment on a limited number of samples of connected Japanese speech.
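The abstract does not give the scoring formula, so the following is only a minimal illustrative sketch of the general idea of combining a segment's own score with a score derived from its context within the word. The function names, the linear blending rule, the `weight` parameter, and the acceptance threshold are all assumptions made for illustration, not the authors' method.

```python
from typing import List


def contextual_scores(segment_scores: List[float], weight: float = 0.5) -> List[float]:
    """Blend each segment's own log-likelihood with the mean log-likelihood
    of the other segments in the same word candidate.

    Hypothetical rule (an assumption, not taken from the paper):
        combined_i = (1 - weight) * own_i + weight * mean(other segments)
    """
    n = len(segment_scores)
    if n <= 1:
        # No context is available for an empty or single-segment word.
        return list(segment_scores)
    total = sum(segment_scores)
    combined = []
    for own in segment_scores:
        context = (total - own) / (n - 1)  # mean score of the other segments
        combined.append((1.0 - weight) * own + weight * context)
    return combined


def accept_word(segment_scores: List[float], threshold: float, weight: float = 0.5) -> bool:
    """Accept a word candidate if every combined segment score reaches the
    threshold (a hypothetical decision rule)."""
    if not segment_scores:
        return False
    return all(s >= threshold for s in contextual_scores(segment_scores, weight))


if __name__ == "__main__":
    scores = [-1.0, -5.0, -1.0]  # made-up segmental log-likelihoods
    print(accept_word(scores, threshold=-4.0, weight=0.0))  # segmental scores only
    print(accept_word(scores, threshold=-4.0, weight=0.5))  # segmental + contextual
```

With `weight=0.0` the sketch reduces to purely segmental scoring, so a single poorly matched segment rejects the word; with a positive weight, a weak segment surrounded by well-matched segments can still pass.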
Bibliographic reference. Ohno, Sumio / Fujisaki, Hiroya / Hirose, Keikichi (1994): "A method for word spotting in continuous speech using both segmental and contextual likelihood scores", In ICSLP-1994, 2199-2202.