INTERSPEECH 2010
11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Japanese Spoken Term Detection Using Syllable Transition Network Derived from Multiple Speech Recognizers' Outputs

Satoshi Natori, Hiromitsu Nishizaki, Yoshihiro Sekiguchi

University of Yamanashi, Japan

This paper proposes a spoken term detection (STD) method that uses a syllable transition network (STN) derived from the outputs of multiple speech recognizers. An STN is similar to a sub-word-based confusion network, which is derived from the output of a single speech recognizer. The proposed STN is instead derived from the outputs of multiple speech recognition systems, a combination that is well known to be robust to certain recognition errors and to the out-of-vocabulary problem. The STN should therefore also make STD robust to recognition errors. The experiment showed that the STN was very effective at detecting out-of-vocabulary terms, improving the detection rate to 83%, which was as high as the in-vocabulary term detection performance.
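
The idea of merging several recognizers' syllable outputs into one network and then searching it can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the syllable sequences have already been aligned to equal length, it ignores scores and time information, and the names `build_stn` and `detect_term` and the toy data are hypothetical.

```python
# Toy sketch only: assumes the recognizer outputs are already aligned
# slot by slot. The real alignment and scoring are not described in the
# abstract, so none of this should be read as the paper's method.

def build_stn(aligned_outputs):
    """Merge pre-aligned syllable sequences into one set of syllables per slot."""
    length = len(aligned_outputs[0])
    if any(len(seq) != length for seq in aligned_outputs):
        raise ValueError("all sequences must be aligned to the same length")
    return [{seq[i] for seq in aligned_outputs} for i in range(length)]


def detect_term(stn, query):
    """Return the start slots where the query syllables match consecutive slots."""
    hits = []
    for start in range(len(stn) - len(query) + 1):
        if all(syl in stn[start + k] for k, syl in enumerate(query)):
            hits.append(start)
    return hits


# Hypothetical outputs of three recognizers for the same utterance.
# Each one misrecognizes a different syllable of "ya ma na shi".
outputs = [
    ["ko", "ya", "ma", "na", "chi"],
    ["ko", "ya", "ba", "na", "shi"],
    ["to", "ya", "ma", "da", "shi"],
]
query = ["ya", "ma", "na", "shi"]

stn = build_stn(outputs)
combined_hits = detect_term(stn, query)
single_hits = [detect_term(build_stn([seq]), query) for seq in outputs]

print(combined_hits)  # start slots found in the merged network
print(single_hits)    # start slots found in each recognizer's output alone
```

In this toy data no single recognizer output contains the full query, but the merged network does, which is the kind of robustness to recognition errors that the abstract attributes to combining multiple recognizers.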


Bibliographic reference.  Natori, Satoshi / Nishizaki, Hiromitsu / Sekiguchi, Yoshihiro (2010): "Japanese spoken term detection using syllable transition network derived from multiple speech recognizers' outputs", In INTERSPEECH-2010, 681-684.