Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Open-Vocabulary Spoken Document Retrieval Based on New Subword Models and Subword Phonetic Similarity

Kohei Iwata (1), Yoshiaki Itoh (1), Kazunori Kojima (1), Masaaki Ishigame (1), Kazuyo Tanaka (2), Shi-wook Lee (3)

(1) Iwate Prefectural University, Japan; (2) University of Tsukuba, Japan; (3) AIST, Japan

A new type of video retrieval system is proposed that identifies a target video section by searching for a passage of words submitted as a quoted speech or text query. The proposed system has two distinguishing characteristics. First, it is based on subword models such as phonemes, syllables, and morphemes, so it can handle any type of query, including new words and personal names. Second, it relies on acoustic similarity between subword models. In addition, new subword models were constructed for the retrieval system to improve performance. The new models were based on two concepts: context dependency, and a more sophisticated structure along the time axis than that of phone models. Experiments confirmed the effectiveness and scope of the proposed spoken document retrieval system, and suitable subword models for the proposed method are discussed.
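To make the idea of matching on subword similarity concrete, the following is a minimal sketch and not the authors' implementation. It assumes the query and the document have already been transcribed into subword (here, phoneme-like) sequences, and it locates the document span closest to the query with a free-start, free-end dynamic-programming alignment. The names `spot_query`, `toy_distance`, and `SIMILAR`, the gap penalty, and all distance values are hypothetical and chosen only for illustration; the abstract does not specify the matching algorithm or the distance measure.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical distances between acoustically close subwords (illustrative values only).
SIMILAR = {frozenset(("b", "p")): 0.3, frozenset(("d", "t")): 0.3}


def toy_distance(a: str, b: str) -> float:
    """Return 0 for identical subwords, a small cost for similar ones, else 1."""
    if a == b:
        return 0.0
    return SIMILAR.get(frozenset((a, b)), 1.0)


def spot_query(
    query: Sequence[str],
    document: Sequence[str],
    dist: Callable[[str, str], float] = toy_distance,
    gap: float = 1.0,
) -> Tuple[float, int, int]:
    """Find the document span that best matches the query.

    Returns (cost, start, end) with `end` exclusive. The alignment may start
    and end anywhere in the document, so surrounding material is free.
    """
    m = len(document)
    # Row for the empty query prefix: zero cost, span starts at position j.
    prev: List[Tuple[float, int]] = [(0.0, j) for j in range(m + 1)]
    for i in range(1, len(query) + 1):
        # Query prefix aligned before any document subword: pay one gap per query subword.
        cur = [(prev[0][0] + gap, prev[0][1])]
        for j in range(1, m + 1):
            substitute = (prev[j - 1][0] + dist(query[i - 1], document[j - 1]), prev[j - 1][1])
            skip_query = (prev[j][0] + gap, prev[j][1])       # query subword left unmatched
            skip_doc = (cur[j - 1][0] + gap, cur[j - 1][1])   # extra subword inside the span
            cur.append(min(substitute, skip_query, skip_doc))
        prev = cur
    end = min(range(m + 1), key=lambda j: prev[j][0])
    cost, start = prev[end]
    return cost, start, end


if __name__ == "__main__":
    doc = ["k", "o", "n", "o", "p", "a", "s", "u", "w", "a"]
    # The query differs from the document in one similar subword ("b" vs "p").
    print(spot_query(["b", "a", "s", "u"], doc))
```

Because the score depends on a distance between subwords rather than on exact word identity, a query containing a word outside any recognizer vocabulary can still be located, which is the property the abstract attributes to the subword-based approach.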


Bibliographic reference. Iwata, Kohei / Itoh, Yoshiaki / Kojima, Kazunori / Ishigame, Masaaki / Tanaka, Kazuyo / Lee, Shi-wook (2006): "Open-vocabulary spoken document retrieval based on new subword models and subword phonetic similarity", In INTERSPEECH-2006, paper 1342-Mon2WeO.2.