14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Grapheme-to-Phoneme Conversion Based on Adaptive Regularization of Weight Vectors

Keigo Kubo, Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura

NAIST, Japan

The current state-of-the-art approach in grapheme-to-phoneme (g2p) conversion is structured learning based on the Margin Infused Relaxed Algorithm (MIRA), an online discriminative training method for multiclass classification. However, MIRA's aggressive weight update is known to be prone to overfitting, because it fits the current example even when that example is an outlier or noisy. Adaptive Regularization of Weight Vectors (AROW) has been proposed to resolve this problem for binary classification. In addition, AROW's update rule is simpler than that of MIRA, allowing for more efficient training. Despite these advantages, AROW has not yet been applied to g2p conversion. In this paper, we apply AROW to g2p conversion, a structured learning problem, for the first time. In an evaluation on a dataset that includes noisy data, our proposed approach achieves a 5.3% error reduction rate in phoneme error rate compared to MIRA as implemented in DirecTL+, while requiring only 78% of the training time.
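
For orientation, the binary-classification form of the AROW update mentioned above can be sketched as follows. This is a minimal illustration of the standard binary AROW rule with a diagonal covariance approximation, not the structured g2p variant developed in the paper. The function name `arow_update`, the list-based feature representation, and the default regularization value `r=1.0` are assumptions made for the example.

```python
def arow_update(w, sigma, x, y, r=1.0):
    """One binary AROW step with a diagonal covariance (illustrative sketch).

    w     : list of weights (the mean of the weight distribution)
    sigma : list of per-feature variances (diagonal of the covariance)
    x     : list of feature values for the current example
    y     : label, either -1 or +1
    r     : regularization constant (assumed value; larger r means smaller updates)
    """
    margin = sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - y * margin)  # hinge loss on the current example
    if loss == 0.0:
        return w, sigma  # margin already satisfied: no update

    # Variance of the margin: high when the features are still uncertain.
    confidence = sum(si * xi * xi for si, xi in zip(sigma, x))
    beta = 1.0 / (confidence + r)
    alpha = loss * beta

    # Move the mean in proportion to each feature's remaining variance,
    # then shrink the variance of the features that were just observed.
    new_w = [wi + alpha * y * si * xi for wi, si, xi in zip(w, sigma, x)]
    new_sigma = [si - beta * (si * xi) ** 2 for si, xi in zip(sigma, x)]
    return new_w, new_sigma


if __name__ == "__main__":
    w, sigma = arow_update([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], y=1)
    print(w, sigma)
```

Because the step size scales with the remaining variance and is damped by `r`, a single noisy example moves the weights less than a fully aggressive update would, which is the property the abstract contrasts with MIRA.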


Bibliographic reference. Kubo, Keigo / Sakti, Sakriani / Neubig, Graham / Toda, Tomoki / Nakamura, Satoshi (2013): "Grapheme-to-phoneme conversion based on adaptive regularization of weight vectors", in INTERSPEECH-2013, 1946-1950.