15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Structured Soft Margin Confidence Weighted Learning for Grapheme-to-Phoneme Conversion

Keigo Kubo, Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura

NAIST, Japan

In recent years, structured online discriminative learning methods that use second-order statistics have been shown to outperform conventional generative and discriminative models on the grapheme-to-phoneme (g2p) conversion task. However, these methods update the parameters by processing the N-best hypotheses predicted with the current parameters sequentially. As a result, the parameters appearing in early hypotheses are overfitted compared with those in later hypotheses. In this paper, we propose a novel method called structured soft margin confidence weighted learning, which extends multi-class confidence weighted (CW) learning to structured learning. The proposed method extends multi-class CW in two ways that improve robustness to overfitting: (1) regularization inspired by soft margin support vector machines, which allows for margin errors, and (2) updates that use the N-best hypotheses simultaneously and interdependently. In an evaluation experiment on the g2p conversion task, the proposed method outperformed all other approaches in terms of phoneme error rate, and the difference was significant.
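The abstract does not give the update equations, so the following is only a rough sketch of the sequential N-best updating that it identifies as a problem, not the proposed method. As a stand-in second-order learner it assumes an AROW-style update with a diagonal covariance. The feature names, the regularization constant `r`, and the helper functions are all hypothetical and chosen purely for illustration.

```python
def arow_update(w, sigma, d, r=1.0):
    """One AROW-style update (diagonal covariance) on a difference vector d.

    w     : dict feature -> mean weight
    sigma : dict feature -> variance (missing entries default to 1.0)
    d     : dict feature -> value, features(gold) - features(hypothesis)
    Returns the step size alpha (0.0 if the margin is already >= 1).
    """
    margin = sum(w.get(f, 0.0) * v for f, v in d.items())
    loss = max(0.0, 1.0 - margin)
    if loss == 0.0:
        return 0.0
    # Variance of the margin under the current diagonal covariance.
    variance = sum(sigma.get(f, 1.0) * v * v for f, v in d.items())
    beta = 1.0 / (variance + r)
    alpha = loss * beta
    for f, v in d.items():
        s = sigma.get(f, 1.0)
        w[f] = w.get(f, 0.0) + alpha * s * v   # move the mean
        sigma[f] = s - beta * (s * v) ** 2     # shrink the variance
    return alpha


def feature_diff(gold, hyp):
    """Sparse difference features(gold) - features(hyp)."""
    d = dict(gold)
    for f, v in hyp.items():
        d[f] = d.get(f, 0.0) - v
    return {f: v for f, v in d.items() if v != 0.0}


def sequential_nbest_update(w, sigma, gold, nbest, r=1.0):
    """Update against each N-best hypothesis in turn (the sequential baseline)."""
    steps = []
    for hyp in nbest:
        d = feature_diff(gold, hyp)
        if d:
            steps.append(arow_update(w, sigma, d, r))
    return steps


# Toy example: hypothetical grapheme:phoneme indicator features.
gold = {"a:AE": 1.0, "t:T": 1.0}
nbest = [
    {"a:EY": 1.0, "t:T": 1.0},  # 1-best hypothesis
    {"a:AA": 1.0, "t:T": 1.0},  # 2-best hypothesis
]
w, sigma = {}, {}
steps = sequential_nbest_update(w, sigma, gold, nbest)
print(steps)
print(w)
```

In this toy run the step taken for the first hypothesis is larger than the one for the second, because the shared gold feature has already been moved and its variance already reduced by the time the second hypothesis is processed. That order dependence is what a simultaneous, interdependent update over the N-best list, as described in the abstract, is meant to avoid.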


Bibliographic reference. Kubo, Keigo / Sakti, Sakriani / Neubig, Graham / Toda, Tomoki / Nakamura, Satoshi (2014): "Structured soft margin confidence weighted learning for grapheme-to-phoneme conversion", in INTERSPEECH-2014, 1263-1267.