8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Memory Efficient Modeling of Polyphone Context with Weighted Finite-State Transducers

Emilian Stoimenov (1), John McDonough (2)

(1) Universität Karlsruhe (TH), Germany
(2) Saarland University, Germany

In earlier work, we derived a transducer HC that translates from sequences of Gaussian mixture models directly to phone sequences. The HC transducer was statically expanded, then determinized and minimized. In this work, we present a refinement of the correct algorithm whereby the initial HC transducer is incrementally expanded and immediately determinized. This technique avoids the need for a full expansion of the initial HC, and thereby reduces the random access memory required to produce the determinized HC by a factor of more than five. With the incremental algorithm, we were able to construct HC for a semi-continuous acoustic model with 16,000 distributions, which reduced the word error rate from 34.1% to 32.9% relative to a fully-continuous system with 4,000 distributions on the lecture meeting portion of the NIST RT05 data.
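The general idea of expanding a machine only as determinization reaches it can be illustrated with a minimal sketch. This is not the algorithm from the paper: it is plain subset construction on an unweighted acceptor, whereas HC is a weighted transducer, and the names `lazy_determinize`, `arcs_of`, and `is_final` are hypothetical, introduced only for illustration. The point shown is that arcs of the source machine are generated on demand, so states that determinization never reaches are never expanded.

```python
from collections import deque

def lazy_determinize(start, arcs_of, is_final):
    """Subset construction that expands the source automaton on demand.

    start:    start state of the nondeterministic source automaton
    arcs_of:  hypothetical callback, state -> iterable of (label, next_state);
              called only for states that determinization actually reaches
    is_final: hypothetical callback, state -> bool
    """
    start_set = frozenset([start])
    ids = {start_set: 0}      # subset of source states -> deterministic state id
    det_arcs = {}             # (det_state, label) -> det_state
    det_finals = set()
    expanded = set()          # source states whose arcs were generated
    queue = deque([start_set])
    while queue:
        subset = queue.popleft()
        src = ids[subset]
        if any(is_final(s) for s in subset):
            det_finals.add(src)
        # Expand only the source states in this subset, grouping targets by label.
        by_label = {}
        for s in subset:
            expanded.add(s)
            for label, nxt in arcs_of(s):
                by_label.setdefault(label, set()).add(nxt)
        for label, targets in sorted(by_label.items()):
            target = frozenset(targets)
            if target not in ids:
                ids[target] = len(ids)
                queue.append(target)
            det_arcs[(src, label)] = ids[target]
    return det_arcs, det_finals, expanded

# Toy source automaton (assumed data): state 9 is unreachable from the start.
toy = {0: [("a", 1), ("a", 2)], 1: [("b", 3)], 2: [("c", 3)], 3: [], 9: [("z", 9)]}
det_arcs, det_finals, expanded = lazy_determinize(
    0, lambda s: toy[s], lambda s: s == 3)
```

In the toy automaton, the two `a` arcs leaving state 0 are merged into a single deterministic arc, and state 9 is never expanded because no path from the start state reaches it.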


Bibliographic reference. Stoimenov, Emilian / McDonough, John (2007): "Memory efficient modeling of polyphone context with weighted finite-state transducers", in INTERSPEECH-2007, 1457-1460.