In earlier work, we derived a transducer HC that translates from sequences of Gaussian mixture models directly to phone sequences. The HC transducer was statically expanded, then determinized and minimized. In this work, we present a refinement of the correct algorithm whereby the initial HC transducer is incrementally expanded and immediately determinized. This technique avoids the need for a full expansion of the initial HC, and thereby reduces the random access memory required to produce the determinized HC by a factor of more than five. With the incremental algorithm, we were able to construct HC for a semi-continuous acoustic model with 16,000 distributions, which reduced the word error rate from 34.1% to 32.9% with respect to a fully-continuous system with 4,000 distributions on the lecture meeting portion of the NIST RT05 data.
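The idea of expanding incrementally and determinizing immediately can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses a plain subset construction on an unweighted acceptor rather than a weighted transducer, and the names `determinize_lazily`, `toy_arcs`, and `toy_final` are hypothetical. The point it shows is that the nondeterministic input is described only by a callback that yields arcs on demand, so it is never stored in fully expanded form.

```python
from collections import deque

def determinize_lazily(start, arcs_from, is_final):
    """Subset construction that asks for input arcs only when needed.

    Assumption: `arcs_from(state)` yields (label, next_state) pairs and
    `is_final(state)` returns a bool; the input automaton is never
    materialized as a whole.
    """
    start_set = frozenset([start])
    ids = {start_set: 0}        # subset of input states -> output state id
    transitions = {}            # (output state, label) -> output state
    finals = set()
    queue = deque([start_set])
    while queue:
        subset = queue.popleft()
        src = ids[subset]
        if any(is_final(s) for s in subset):
            finals.add(src)
        # Expand only the input states in this subset, grouping by label.
        by_label = {}
        for s in subset:
            for label, nxt in arcs_from(s):
                by_label.setdefault(label, set()).add(nxt)
        for label, targets in by_label.items():
            tgt = frozenset(targets)
            if tgt not in ids:
                ids[tgt] = len(ids)
                queue.append(tgt)
            transitions[(src, label)] = ids[tgt]
    return transitions, finals

# Toy nondeterministic acceptor for strings over {a, b} ending in "ab".
def toy_arcs(state):
    if state == 0:
        return [("a", 0), ("b", 0), ("a", 1)]
    if state == 1:
        return [("b", 2)]
    return []

def toy_final(state):
    return state == 2

transitions, finals = determinize_lazily(0, toy_arcs, toy_final)
```

In this toy case the deterministic result has three states. Only the subsets reachable from the start state are ever created, which is the memory-saving property the sketch is meant to convey.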
Cite as: Stoimenov, E., McDonough, J. (2007) Memory efficient modeling of polyphone context with weighted finite-state transducers. Proc. Interspeech 2007, 1457-1460, doi: 10.21437/Interspeech.2007-423
@inproceedings{stoimenov07_interspeech,
  author={Emilian Stoimenov and John McDonough},
  title={{Memory efficient modeling of polyphone context with weighted finite-state transducers}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={1457--1460},
  doi={10.21437/Interspeech.2007-423}
}