Conditional random fields (CRFs) have been applied successfully to a number of NLP tasks, such as concept tagging, named entity tagging, and grapheme-to-phoneme conversion. When the training data provide no alignment between source and target side, it is challenging to build a CRF system with state-of-the-art performance. In this work, we present an approach that incorporates an M-to-N alignment as a hidden variable within a transducer-based implementation of CRFs. With integrated estimation of transition penalties, it was possible to train a state-of-the-art hidden CRF system in reasonable time for an English grapheme-to-phoneme conversion task without using an external model to provide the alignment.
Index Terms: CRF, G2P, Alignment, M-N
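To make the notion of an M-to-N alignment as a hidden variable more concrete, the following minimal Python sketch enumerates and counts the monotone alignments between a grapheme sequence and a phoneme sequence. It is an illustration only, not the transducer-based implementation described in the abstract. The function names `enumerate_alignments` and `count_alignments`, the limits `max_m` and `max_n`, the restriction to at least one symbol on each side of a segment, and the ARPAbet-style phoneme symbols are all assumptions made for this example.

```python
from functools import lru_cache


def enumerate_alignments(graphemes, phonemes, max_m=2, max_n=2):
    """Yield every monotone alignment as a list of (grapheme_chunk, phoneme_chunk).

    Assumption for this sketch: each segment pairs 1..max_m graphemes
    with 1..max_n phonemes, and segments may not cross or skip symbols.
    """
    def rec(i, j):
        # Both sequences fully consumed: one complete alignment found.
        if i == len(graphemes) and j == len(phonemes):
            yield []
            return
        for m in range(1, max_m + 1):
            if i + m > len(graphemes):
                break
            for n in range(1, max_n + 1):
                if j + n > len(phonemes):
                    break
                head = (tuple(graphemes[i:i + m]), tuple(phonemes[j:j + n]))
                for tail in rec(i + m, j + n):
                    yield [head] + tail

    yield from rec(0, 0)


def count_alignments(num_graphemes, num_phonemes, max_m=2, max_n=2):
    """Count the same alignments by dynamic programming, without listing them."""
    @lru_cache(maxsize=None)
    def count(i, j):
        if i == num_graphemes and j == num_phonemes:
            return 1
        total = 0
        for m in range(1, max_m + 1):
            for n in range(1, max_n + 1):
                if i + m <= num_graphemes and j + n <= num_phonemes:
                    total += count(i + m, j + n)
        return total

    return count(0, 0)


if __name__ == "__main__":
    # Hypothetical example word and phoneme symbols, chosen for illustration.
    word = "phone"
    pron = ["F", "OW", "N"]
    for alignment in enumerate_alignments(word, pron, max_m=2, max_n=1):
        print(alignment)
    print(count_alignments(len(word), len(pron), max_m=2, max_n=1))
```

In a hidden-variable model, the probability of a phoneme sequence would be obtained by summing over all such alignments rather than committing to a single one supplied by an external model. The dynamic-programming count shows that this sum can be organized over positions `(i, j)` instead of over explicit alignment lists.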
Bibliographic reference. Lehnen, Patrick / Hahn, Stefan / Guta, Vlad-Andrei / Ney, Hermann (2012): "Hidden conditional random fields with M-to-N alignments for grapheme-to-phoneme conversion", In INTERSPEECH-2012, 2554-2557.