ISCA Archive Interspeech 2013

Pre-initialized composition for large-vocabulary speech recognition

Cyril Allauzen, Michael Riley

This paper describes a modified composition algorithm for combining two finite-state transducers, representing the context-dependent lexicon and the language model respectively, in large-vocabulary speech recognition. The algorithm is a hybrid between static and dynamic expansion of the resulting transducer, which maps from context-dependent phones to words and is searched during decoding. The approach is to pre-compute part of the recognition transducer and leave the remainder to be expanded during decoding. This allows a fine-grained trade-off between space and time in recognition. For example, the time overhead of purely dynamic expansion can be reduced more than six-fold with only a 20% increase in memory on a collection of large-vocabulary recognition tasks available on the Google Android platform.
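
The hybrid idea can be illustrated with a toy sketch. This is not the authors' implementation and not the OpenFst API: the class `LazyCompose`, the method `pre_initialize`, and the `max_depth` parameter are hypothetical names, the transducers are assumed epsilon-free with tropical (additive) weights, and choosing the pre-computed states by breadth-first distance from the start state is an assumption made only for illustration.

```python
from collections import deque


class LazyCompose:
    """Toy on-demand composition of two epsilon-free transducers.

    Each transducer is a dict: state -> list of (ilabel, olabel, weight, next).
    Composed states are pairs (q1, q2); weights are added (tropical semiring).
    """

    def __init__(self, fst1, fst2, start1=0, start2=0):
        self.fst1, self.fst2 = fst1, fst2
        self.start = (start1, start2)
        self.cache = {}               # composed state -> list of composed arcs
        self.dynamic_expansions = 0   # expansions paid for at "decoding" time

    def _expand(self, state):
        """Match output labels of fst1 against input labels of fst2."""
        q1, q2 = state
        arcs = []
        for ilabel, mid1, w1, n1 in self.fst1.get(q1, []):
            for mid2, olabel, w2, n2 in self.fst2.get(q2, []):
                if mid1 == mid2:
                    arcs.append((ilabel, olabel, w1 + w2, (n1, n2)))
        return arcs

    def arcs(self, state):
        """Return the arcs of a composed state, expanding it on first use."""
        if state not in self.cache:
            self.dynamic_expansions += 1
            self.cache[state] = self._expand(state)
        return self.cache[state]

    def pre_initialize(self, max_depth):
        """Expand ahead of time every state fewer than max_depth arcs from start.

        Assumption: breadth-first distance is only an illustrative criterion.
        """
        queue = deque([(self.start, 0)])
        seen = {self.start}
        while queue:
            state, depth = queue.popleft()
            if depth >= max_depth:
                continue
            self.cache[state] = self._expand(state)
            for _, _, _, nxt in self.cache[state]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))


def visit_all(comp):
    """Stand-in for a decoder: touch every reachable composed state."""
    stack, seen = [comp.start], {comp.start}
    while stack:
        for _, _, _, nxt in comp.arcs(stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Hypothetical toy data: phones -> words, then words -> words.
lexicon = {0: [("a", "x", 1.0, 1)], 1: [("b", "y", 0.5, 2)], 2: []}
grammar = {0: [("x", "X", 0.5, 1)], 1: [("y", "Y", 0.25, 0)]}

dynamic = LazyCompose(lexicon, grammar)   # purely dynamic expansion
visit_all(dynamic)

hybrid = LazyCompose(lexicon, grammar)    # part pre-computed, rest on demand
hybrid.pre_initialize(max_depth=2)
visit_all(hybrid)
```

In this sketch, raising `max_depth` stores more composed states up front (more memory) and leaves fewer expansions for search time, which mirrors the space/time trade-off described above in miniature.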


doi: 10.21437/Interspeech.2013-190

Cite as: Allauzen, C., Riley, M. (2013) Pre-initialized composition for large-vocabulary speech recognition. Proc. Interspeech 2013, 666-670, doi: 10.21437/Interspeech.2013-190

@inproceedings{allauzen13_interspeech,
  author={Cyril Allauzen and Michael Riley},
  title={{Pre-initialized composition for large-vocabulary speech recognition}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={666--670},
  doi={10.21437/Interspeech.2013-190}
}