Robust LTS rules with the Combilex speech technology lexicon

Korin Richmond, Robert A. J. Clark, Sue Fitt

Combilex is a high-quality pronunciation lexicon, aimed at speech technology applications, that has recently been released by CSTR. Combilex benefits from several advanced features. This paper evaluates one of these: the explicit alignment of phones to graphemes in a word. This alignment can help to rapidly develop robust and accurate letter-to-sound (LTS) rules, without needing to rely on automatic alignment methods. To evaluate this, we used Festival’s LTS module, comparing its standard automatic alignment with Combilex’s explicit alignment. Our results show that using Combilex’s alignment improves LTS accuracy: 86.50% of words correct, as opposed to 84.49%, with our most general form of lexicon. In addition, building LTS models is greatly accelerated, as the need to list allowed alignments is removed. Finally, a loose comparison with other studies indicates that Combilex is a superior-quality lexicon in terms of consistency and size.

doi: 10.21437/Interspeech.2009-405

Cite as: Richmond, K., Clark, R.A.J., Fitt, S. (2009) Robust LTS rules with the Combilex speech technology lexicon. Proc. Interspeech 2009, 1295-1298, doi: 10.21437/Interspeech.2009-405

