14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Intensive Acoustic Models Constructed by Integrating Low-Occurrence Models for Spoken Term Detection

Shiro Narumi (1), Kazuma Konno (1), Takuya Nakano (1), Yoshiaki Itoh (1), Kazunori Kojima (1), Masaaki Ishigame (1), Kazuyo Tanaka (2), Shi-wook Lee (3)

(1) Iwate Prefectural University, Japan
(2) Tsukuba University, Japan
(3) AIST, Japan

Triphone acoustic models are often used as subword models for detecting out-of-vocabulary query terms in Spoken Term Detection (STD) systems. Our preliminary experiments revealed that the training data are insufficient for a large portion of the approximately 8,000 triphone models. Assuming that such insufficiently trained models degrade STD performance, this paper proposes intensive triphone models constructed by integrating low-occurrence triphone models into high-occurrence ones. Experiments conducted on an actual lecture speech corpus showed that the proposed method improves STD performance for both triphones and demiphones, demonstrating its effectiveness.
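
The idea of folding low-occurrence triphones into high-occurrence ones can be illustrated with a minimal Python sketch. The abstract does not state the integration criterion, so everything specific here is an assumption made for illustration: the `left-center+right` triphone naming, the `min_count` threshold, the function names, and the rule of merging into the most frequent well-trained triphone that shares the center phone (preferring one that also shares a context). It is not the authors' procedure.

```python
from collections import Counter


def split_triphone(name):
    """Split a name of the assumed form 'left-center+right' into its parts."""
    left, rest = name.split("-", 1)
    center, right = rest.split("+", 1)
    return left, center, right


def build_integration_map(counts, min_count):
    """Map every triphone to the model that would represent it.

    Assumed rule (hypothetical, not taken from the paper): a triphone seen
    fewer than min_count times is mapped to a high-occurrence triphone with
    the same center phone, preferring shared contexts, then higher counts.
    """
    high = {t: c for t, c in counts.items() if c >= min_count}
    mapping = {}
    for tri, count in counts.items():
        if count >= min_count:
            mapping[tri] = tri  # enough training data: keep its own model
            continue
        left, center, right = split_triphone(tri)
        candidates = [t for t in high if split_triphone(t)[1] == center]
        if not candidates:
            mapping[tri] = tri  # no suitable target: leave unchanged
            continue

        def score(target):
            t_left, _, t_right = split_triphone(target)
            shared = (t_left == left) + (t_right == right)
            return (shared, high[target])

        mapping[tri] = max(candidates, key=score)
    return mapping


# Toy occurrence counts, invented purely for illustration.
counts = Counter({"a-k+i": 500, "o-k+i": 3, "a-k+u": 120, "e-s+u": 2})
mapping = build_integration_map(counts, min_count=10)
```

In this toy data, `o-k+i` is rare and is represented by `a-k+i`, which shares its center phone and right context, while `e-s+u` has no well-trained counterpart and is left as it is.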

Bibliographic reference.  Narumi, Shiro / Konno, Kazuma / Nakano, Takuya / Itoh, Yoshiaki / Kojima, Kazunori / Ishigame, Masaaki / Tanaka, Kazuyo / Lee, Shi-wook (2013): "Intensive acoustic models constructed by integrating low-occurrence models for spoken term detection", In INTERSPEECH-2013, 25-28.