11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Large Vocabulary Continuous Speech Recognition Using WFST-Based Linear Classifier for Structured Data

Shinji Watanabe, Takaaki Hori, Atsushi Nakamura

NTT Corporation, Japan

This paper describes a discriminative approach that further advances the framework for Weighted Finite State Transducer (WFST) based decoding. The approach introduces additional linear models that adjust the scores of a decoding graph composed of conventional information source models, and views the WFST-based decoding process as a linear classifier for structured data. The difficulty is that the number of dimensions of the additional linear models grows in proportion to the number of arcs in the WFST and therefore becomes very large; our previous study applied the approach only to a small task. This paper proposes a training method, based on a distributed perceptron algorithm, for the large-scale linear classifier employed in WFST-based decoding. Experimental results show that the proposed approach was successfully applied to a large vocabulary continuous speech recognition task and improved on the performance obtained with discriminative training of acoustic models.
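The idea of one correction weight per WFST arc, trained with a distributed perceptron, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: paths are tuples of hypothetical arc ids, `base` holds toy log-scores standing in for the decoding graph, a short candidate list stands in for a real WFST search, and the distributed step uses simple parameter averaging across shards, which is one common variant and may differ from what the authors used.

```python
from collections import Counter, defaultdict

def path_score(path, base, w):
    # Linear score of a path: base arc score plus a learned per-arc correction.
    return sum(base[a] + w.get(a, 0.0) for a in path)

def decode(candidates, base, w):
    # Stand-in for WFST search: pick the best path from a small candidate list.
    return max(candidates, key=lambda p: path_score(p, base, w))

def perceptron_epoch(shard, base, w_init):
    # One structured-perceptron pass over a shard, starting from shared weights.
    w = defaultdict(float, w_init)
    for candidates, ref in shard:
        hyp = decode(candidates, base, w)
        if hyp != ref:
            # Features are arc counts: reward reference arcs, penalize hypothesis arcs.
            for a, c in Counter(ref).items():
                w[a] += c
            for a, c in Counter(hyp).items():
                w[a] -= c
    return w

def distributed_perceptron(shards, base, epochs=5):
    # Each shard trains independently (sequentially here; in parallel in practice),
    # then the per-shard weights are averaged and redistributed.
    w = {}
    for _ in range(epochs):
        local = [perceptron_epoch(s, base, w) for s in shards]
        keys = set().union(*local)
        w = {a: sum(l.get(a, 0.0) for l in local) / len(local) for a in keys}
    return w

# Toy data (hypothetical): four arcs and two utterances, one per shard.
base = {0: -1.0, 1: -2.0, 2: -1.5, 3: -0.5}
utt1 = ([(0, 1), (2, 3)], (0, 1))   # base scores alone prefer the wrong path (2, 3)
utt2 = ([(0, 3), (2, 1)], (0, 3))   # base scores alone already prefer the reference
shards = [[utt1], [utt2]]

weights = distributed_perceptron(shards, base)
before = decode(utt1[0], base, {})
after = decode(utt1[0], base, weights)
```

In this toy setup the weight vector has one dimension per arc, which is why the dimensionality grows with the size of the WFST and why splitting the training data across workers becomes attractive for a large vocabulary task.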

Bibliographic reference.  Watanabe, Shinji / Hori, Takaaki / Nakamura, Atsushi (2010): "Large vocabulary continuous speech recognition using WFST-based linear classifier for structured data", In INTERSPEECH-2010, 346-349.