15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Encoding Linear Models as Weighted Finite-State Transducers

Ke Wu (1), Cyril Allauzen (2), Keith Hall (2), Michael Riley (2), Brian Roark (2)

(1) University of Maryland, USA
(2) Google, USA

We present algorithms, implemented as an extension to the OpenFst library, that yield a class of transducers encoding linear models for structured inference tasks such as segmentation and tagging. This allows general finite-state operations to be used with such models. For instance, finite-state composition can apply the model to lattice input (or other, more general automata), and the resulting automaton can then be passed to subsequent processing such as general shortest-path algorithms. We demonstrate the library extension on grapheme-to-phoneme (g2p) conversion, encoding multiple varieties of linear models for that task, and achieve solid phoneme error rate (PER) and word error rate (WER) gains over the previous best reported results on g2p conversion of a publicly available dataset (CMU).
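The compose-then-shortest-path pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It does not use OpenFst or the paper's extension, and it is not the authors' algorithm: the feature weights (EMIT, TRANS), the tag set, the default cost of 5.0, and the toy lattice are all hypothetical values chosen for illustration. A linear tagging model with emission and tag-bigram features is applied to a small word lattice by pairing each lattice state with the previous tag (a simplified stand-in for composition), and the best path is then found with Dijkstra's algorithm, which corresponds to shortest path in the tropical semiring when all costs are non-negative.

```python
import heapq
from collections import defaultdict

# Hypothetical feature weights, stored as costs (lower is better).
EMIT = {("a", "X"): 0.5, ("a", "Y"): 1.5, ("b", "X"): 2.0, ("b", "Y"): 0.2}
TRANS = {("<s>", "X"): 0.1, ("<s>", "Y"): 0.4,
         ("X", "X"): 1.0, ("X", "Y"): 0.3,
         ("Y", "X"): 0.6, ("Y", "Y"): 0.9}
TAGS = ("X", "Y")
UNSEEN = 5.0  # assumed default cost for a feature missing from the tables


def arc_cost(prev_tag, word, tag):
    """Linear model score of one tagging decision: a sum of feature costs."""
    return EMIT.get((word, tag), UNSEEN) + TRANS.get((prev_tag, tag), UNSEEN)


def compose(lattice, start):
    """Apply the model to a lattice.

    lattice maps a state to a list of (word, cost, next_state) arcs.
    States of the result are (lattice_state, previous_tag) pairs.
    """
    arcs = defaultdict(list)
    first = (start, "<s>")
    stack, seen = [first], {first}
    while stack:
        src = stack.pop()
        state, prev_tag = src
        for word, cost, nxt in lattice.get(state, []):
            for tag in TAGS:
                dst = (nxt, tag)
                weight = cost + arc_cost(prev_tag, word, tag)
                arcs[src].append((word, tag, weight, dst))
                if dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
    return arcs


def shortest_path(arcs, start, finals):
    """Dijkstra over the composed machine; requires non-negative costs."""
    heap = [(0.0, (start, "<s>"), ())]
    done = set()
    while heap:
        cost, state, path = heapq.heappop(heap)
        if state in done:
            continue
        done.add(state)
        if state[0] in finals:
            return cost, list(path)
        for word, tag, weight, dst in arcs.get(state, []):
            if dst not in done:
                heapq.heappush(heap, (cost + weight, dst, path + ((word, tag),)))
    return None


# Toy lattice: the first word is "a" or, at extra cost, "b"; the second is "b".
lattice = {0: [("a", 0.0, 1), ("b", 0.7, 1)], 1: [("b", 0.0, 2)]}
cost, path = shortest_path(compose(lattice, 0), 0, finals={2})
print(cost, path)
```

Because the tag bigram is the longest feature in this toy model, remembering only the previous tag in each composed state is sufficient; longer feature histories would require correspondingly richer states.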


Bibliographic reference. Wu, Ke / Allauzen, Cyril / Hall, Keith / Riley, Michael / Roark, Brian (2014): "Encoding linear models as weighted finite-state transducers", in INTERSPEECH-2014, 1258-1262.