Triphone acoustic models are often used as subword models for detecting out-of-vocabulary query terms in Spoken Term Detection (STD) systems. Our preliminary experiments revealed that the training data are insufficient for a large portion of the approximately 8,000 triphone models. On the assumption that such insufficiently trained models degrade STD performance, this paper proposes intensive triphone models, constructed by integrating low-occurrence triphone models into high-occurrence ones. Experiments on an actual lecture speech corpus showed that the proposed method improves STD performance for both triphone and demiphone models.
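The abstract does not state how a low-occurrence model is assigned to a high-occurrence one, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes triphones are written as `left-center+right`, that a hypothetical threshold `min_count` separates low-occurrence from high-occurrence models, and that a rare triphone is merged into a frequent triphone with the same center phone, preferring one that shares a context. The function name `build_merge_map` is likewise hypothetical.

```python
def parse(triphone):
    """Split a label of the assumed form 'left-center+right' into its parts."""
    left, _, rest = triphone.partition("-")
    center, _, right = rest.partition("+")
    return left, center, right


def build_merge_map(triphone_counts, min_count):
    """Map every triphone to the model that would represent it.

    Assumed rule (not taken from the paper): a triphone seen fewer than
    min_count times is merged into a frequent triphone with the same center
    phone, preferring shared left/right context, then higher count.
    """
    frequent = [t for t, n in triphone_counts.items() if n >= min_count]
    merge_map = {}
    for tri, n in triphone_counts.items():
        if n >= min_count:
            merge_map[tri] = tri  # enough data: keep its own model
            continue
        left, center, right = parse(tri)
        candidates = [f for f in frequent if parse(f)[1] == center]
        if not candidates:
            merge_map[tri] = tri  # no frequent model to merge into
            continue

        def score(f):
            f_left, _, f_right = parse(f)
            shared = (f_left == left) + (f_right == right)
            return (shared, triphone_counts[f])

        merge_map[tri] = max(candidates, key=score)
    return merge_map
```

With counts such as `{"a-k+i": 500, "o-k+i": 3}` and `min_count=10`, the rare `o-k+i` would be represented by `a-k+i` under these assumptions, so its training frames could be pooled with those of the frequent model.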
Bibliographic reference. Narumi, Shiro / Konno, Kazuma / Nakano, Takuya / Itoh, Yoshiaki / Kojima, Kazunori / Ishigame, Masaaki / Tanaka, Kazuyo / Lee, Shi-wook (2013): "Intensive acoustic models constructed by integrating low-occurrence models for spoken term detection", In INTERSPEECH-2013, 25-28.