ISCA Archive Interspeech 2013

Structure learning in hidden conditional random fields for grapheme-to-phoneme conversion

Patrick Lehnen, Alexandre Allauzen, Thomas Lavergne, François Yvon, Stefan Hahn, Hermann Ney

Accurate grapheme-to-phoneme (g2p) conversion is needed for several speech processing applications, such as automatic speech synthesis and recognition. For some languages, notably English, progress on g2p systems has been very slow, owing to the intricate associations between letters and sounds. In recent years, several improvements have been obtained either by using variable-length associations in generative models (joint-n-grams), or by recasting the problem as a conventional sequence labeling task, which makes it possible to integrate rich dependencies in discriminative models. In this paper, we consider several ways to reconcile these two approaches. By introducing hidden variable-length alignments through latent variables, our Hidden Conditional Random Field (HCRF) models achieve performance comparable to that of strong generative and discriminative models on the CELEX database.


doi: 10.21437/Interspeech.2013-544

Cite as: Lehnen, P., Allauzen, A., Lavergne, T., Yvon, F., Hahn, S., Ney, H. (2013) Structure learning in hidden conditional random fields for grapheme-to-phoneme conversion. Proc. Interspeech 2013, 2326-2330, doi: 10.21437/Interspeech.2013-544

@inproceedings{lehnen13_interspeech,
  author={Patrick Lehnen and Alexandre Allauzen and Thomas Lavergne and François Yvon and Stefan Hahn and Hermann Ney},
  title={{Structure learning in hidden conditional random fields for grapheme-to-phoneme conversion}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={2326--2330},
  doi={10.21437/Interspeech.2013-544}
}