11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Efficient Data Selection for Speech Recognition Based on Prior Confidence Estimation Using Speech and Context Independent Models

Satoshi Kobashikawa, Taichi Asami, Yoshikazu Yamaguchi, Hirokazu Masataki, Satoshi Takahashi

NTT Corporation, Japan

This paper proposes an efficient data selection technique for identifying well-recognized texts in massive volumes of speech data. Conventional confidence measures can identify such accurately recognized data, but they need speech recognition results in order to estimate confidence. When much of the data has low confidence, considerable computing resources are wasted, since inaccurate recognition results are generated only to be rejected later. The proposed technique rapidly estimates a prior confidence before speech recognition, using only an acoustic likelihood calculation with speech and context-independent models; it then selectively recognizes only the data with high prior confidence. Simulations show that it matches the data selection performance of the conventional posterior confidence measure with less than 2% of the computation time.
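
The select-before-recognize idea can be illustrated with a minimal sketch. The abstract does not state the scoring formula, so the version below makes an assumption: prior confidence is taken to be the average per-frame log-likelihood ratio between the best-scoring context-independent model and a single generic speech model. Single diagonal Gaussians stand in for the acoustic models, and every name here (`DiagGaussian`, `prior_confidence`, `select_for_recognition`, `threshold`) is hypothetical rather than taken from the paper.

```python
import math
from typing import Dict, List, Sequence

Frame = Sequence[float]


class DiagGaussian:
    """Single diagonal-covariance Gaussian, a simplified stand-in for an acoustic model."""

    def __init__(self, mean: Sequence[float], var: Sequence[float]) -> None:
        self.mean = list(mean)
        self.var = list(var)

    def log_likelihood(self, frame: Frame) -> float:
        # Sum of per-dimension Gaussian log densities.
        return sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
            for x, m, v in zip(frame, self.mean, self.var)
        )


def prior_confidence(
    frames: Sequence[Frame],
    speech_model: DiagGaussian,
    ci_models: Dict[str, DiagGaussian],
) -> float:
    """Assumed score: mean over frames of (best CI log-likelihood - speech log-likelihood)."""
    if not frames:
        return float("-inf")  # nothing to score, so never selected
    total = 0.0
    for frame in frames:
        best_ci = max(m.log_likelihood(frame) for m in ci_models.values())
        total += best_ci - speech_model.log_likelihood(frame)
    return total / len(frames)


def select_for_recognition(
    utterances: Dict[str, Sequence[Frame]],
    speech_model: DiagGaussian,
    ci_models: Dict[str, DiagGaussian],
    threshold: float,
) -> List[str]:
    """Return IDs of utterances whose prior confidence reaches the threshold."""
    return [
        utt_id
        for utt_id, frames in utterances.items()
        if prior_confidence(frames, speech_model, ci_models) >= threshold
    ]
```

Only the utterances returned by `select_for_recognition` would then be passed to the full recognizer. The cost of scoring is one likelihood evaluation per frame per model, with no decoding search, which is the kind of saving the abstract describes. The threshold would have to be tuned on held-out data.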


Bibliographic reference. Kobashikawa, Satoshi / Asami, Taichi / Yamaguchi, Yoshikazu / Masataki, Hirokazu / Takahashi, Satoshi (2010): "Efficient data selection for speech recognition based on prior confidence estimation using speech and context independent models", in INTERSPEECH-2010, 238-241.