10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Compacting Discriminative Feature Space Transforms for Embedded Devices

Etienne Marcheret (1), Jia-Yu Chen (2), Petr Fousek (3), Peder A. Olsen (1), Vaibhava Goel (1)

(1) IBM T.J. Watson Research Center, USA
(2) University of Illinois at Urbana-Champaign, USA
(3) IBM Research, Czech Republic

Discriminative training of the feature space using the minimum phone error objective function (fMPE) has been shown to yield remarkable accuracy improvements. These gains, however, come at a high memory cost. In this paper we present techniques that maintain fMPE performance while reducing the required memory by approximately 94%. This is achieved by designing a quantization methodology that minimizes the error between the true fMPE computation and the one produced with the quantized parameters. We also illustrate a Viterbi search over the allocation of quantization levels, which provides a framework for optimal non-uniform allocation of quantization levels over the dimensions of the fMPE feature vector. This search yields an additional 8% relative reduction in required memory with no loss in recognition accuracy.
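
The abstract does not spell out how the search over quantization levels is formulated, so the following is only a minimal sketch of one way such an allocation could be set up, not the authors' implementation. It assumes that the quantization error of each feature dimension has already been measured for every candidate number of levels, that these errors add across dimensions, and that the memory budget is counted in bits. The function name `allocate_levels`, the tables `err` and `costs`, and the numbers in the usage example are all hypothetical. The search is a Viterbi-style dynamic program: each stage is a feature dimension, each state is the number of bits spent so far, and back-pointers recover the best path.

```python
import math


def allocate_levels(err, costs, budget):
    """Pick one quantizer option per dimension to minimize total error.

    err[d][k] -- assumed, precomputed error of dimension d under option k
    costs[k]  -- bits consumed by option k (hypothetical cost model)
    budget    -- maximum total bits over all dimensions
    Returns (total_error, list of chosen option indices).
    """
    # best[b] = lowest accumulated error with exactly b bits spent so far
    best = [math.inf] * (budget + 1)
    best[0] = 0.0
    back = []  # back[d][b] = (bits spent before dimension d, option chosen)

    for dim_err in err:
        new = [math.inf] * (budget + 1)
        ptr = [None] * (budget + 1)
        for b, acc in enumerate(best):
            if acc == math.inf:
                continue  # state not reachable
            for k, cost in enumerate(costs):
                nb = b + cost
                if nb <= budget and acc + dim_err[k] < new[nb]:
                    new[nb] = acc + dim_err[k]
                    ptr[nb] = (b, k)
        best = new
        back.append(ptr)

    # Best final state, then trace the back-pointers to recover the path.
    end = min(range(budget + 1), key=lambda b: best[b])
    if best[end] == math.inf:
        raise ValueError("budget too small for any allocation")
    total, choice, b = best[end], [], end
    for ptr in reversed(back):
        b, k = ptr[b]
        choice.append(k)
    choice.reverse()
    return total, choice


# Usage with made-up error tables: three dimensions, options of 1, 2 or 3 bits.
err = [[4.0, 1.0, 0.25], [0.4, 0.1, 0.025], [9.0, 2.0, 0.5]]
costs = [1, 2, 3]
total, choice = allocate_levels(err, costs, budget=6)
print(choice)  # option index chosen for each dimension
```

In this toy example a uniform allocation of 2 bits per dimension also fits the 6-bit budget, but the search moves bits away from the dimension that is cheap to quantize and toward the one with the largest error, which is the kind of non-uniform allocation the abstract describes.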

Bibliographic reference. Marcheret, Etienne / Chen, Jia-Yu / Fousek, Petr / Olsen, Peder A. / Goel, Vaibhava (2009): "Compacting discriminative feature space transforms for embedded devices", in INTERSPEECH-2009, 228-231.