5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Improvements on a Trainable Letter-To-Sound Converter

Li Jiang, Hsiao-Wuen Hon, Xuedong Huang

Microsoft Research, Redmond, WA, USA

Letter-to-sound (LTS) conversion is important for both text-to-speech (TTS) and automatic speech recognition (ASR). In this paper we discuss improvements to our trainable LTS converter. We use a classification and regression tree (CART) to automatically configure the most salient phonological rules needed for LTS conversion. We address problems in growing multiple trees and in using phonotactic information for better generalization. Experiments were carried out on both the NETTALK database and the CMU dictionary. With the improved techniques, the conversion error rates at the phoneme level and word level were reduced by 15% and 20%, respectively. For both tasks, the phoneme conversion error rate was reduced to about 8%.
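As a rough illustration of tree-based LTS conversion, the sketch below grows a small decision tree that asks questions about the letters surrounding each target letter and predicts one phoneme per letter. It is not the authors' implementation: the Gini splitting criterion, the one-letter context window, the function names (`windows`, `grow`, `predict`, `error_rates`) and the four-word aligned lexicon are all assumptions made for the example, and it does not cover the multiple-tree or phonotactic techniques the abstract mentions. It also assumes each letter is already aligned to exactly one phoneme symbol.

```python
from collections import Counter

def windows(word, width=1):
    """Yield one context tuple per letter: `width` letters each side, '#' as padding."""
    padded = "#" * width + word + "#" * width
    for i in range(len(word)):
        yield tuple(padded[i:i + 2 * width + 1])

def gini(labels):
    """Gini impurity of a list of phoneme labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow(samples):
    """Grow a tree from (context, phoneme) pairs.
    A leaf is a phoneme string; an internal node is (position, letter, yes, no)."""
    labels = [p for _, p in samples]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1:
        return majority
    best = None
    for pos in range(len(samples[0][0])):
        for letter in sorted({ctx[pos] for ctx, _ in samples}):
            # Question: "is the letter at this context position equal to `letter`?"
            yes = [s for s in samples if s[0][pos] == letter]
            no = [s for s in samples if s[0][pos] != letter]
            if not yes or not no:
                continue
            score = (len(yes) * gini([p for _, p in yes])
                     + len(no) * gini([p for _, p in no])) / len(samples)
            if best is None or score < best[0]:
                best = (score, pos, letter, yes, no)
    if best is None:  # identical contexts with conflicting phonemes
        return majority
    _, pos, letter, yes, no = best
    return (pos, letter, grow(yes), grow(no))

def predict(tree, word, width=1):
    """Predict one phoneme per letter by walking the tree for each context."""
    out = []
    for ctx in windows(word, width):
        node = tree
        while isinstance(node, tuple):
            pos, letter, yes, no = node
            node = yes if ctx[pos] == letter else no
        out.append(node)
    return out

def error_rates(tree, lexicon, width=1):
    """Return (phoneme error rate, word error rate) over an aligned lexicon."""
    ph_err = ph_total = wd_err = 0
    for word, phones in lexicon:
        wrong = sum(h != r for h, r in zip(predict(tree, word, width), phones))
        ph_err += wrong
        ph_total += len(phones)
        wd_err += wrong > 0
    return ph_err / ph_total, wd_err / len(lexicon)

# Hypothetical toy lexicon (not taken from NETTALK or the CMU dictionary).
LEXICON = [
    ("cat", ["k", "ae", "t"]),
    ("city", ["s", "ih", "t", "iy"]),
    ("cot", ["k", "aa", "t"]),
    ("cent", ["s", "eh", "n", "t"]),
]
TRAIN = [(ctx, ph) for word, phones in LEXICON
         for ctx, ph in zip(windows(word), phones)]
TREE = grow(TRAIN)
```

Calling `predict(TREE, "cat")` reproduces the training pronunciation, and `error_rates(TREE, LEXICON)` reports the phoneme-level and word-level error rates on whatever aligned lexicon is passed in. A word counts as wrong if any of its phonemes is wrong, which is why the word-level rate is always at least as high as the phoneme-level rate on the same data.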


Bibliographic reference. Jiang, Li / Hon, Hsiao-Wuen / Huang, Xuedong (1997): "Improvements on a trainable letter-to-sound converter", In EUROSPEECH-1997, 605-608.