Third European Conference on Speech Communication and Technology

Berlin, Germany
September 22-25, 1993


Reversible Letter-to-Sound Sound-to-Letter Generation Based on Parsing Word Morphology

Sheri Hunnicutt, Helen Meng, Stephanie Seneff, Victor W. Zue

Spoken Language Systems Group, Laboratory for Computer Science, M.I.T., Cambridge, MA, USA

This paper describes a reversible letter-to-sound/sound-to-letter system based on a strategy that combines data-driven techniques with a rule-based formalism. Our approach is to provide a hierarchical analysis of a word, including information such as stress pattern, morphology and syllabification, which incorporates probabilities that are trained from a parsed lexicon. Our training and testing corpora consisted of spellings and pronunciations for the high frequency portion of the Brown Corpus (10,000 words). We augmented the phonetic labels with markers indicating morphology and stress. We report here on two distinct grammars representing a historical perspective. Our early work with the first grammar inspired us to modify the grammar formalism, leading to greater constraint with fewer rules. We evaluated our performance on letter-to-sound generation in terms of whole word accuracy as well as phoneme accuracy. For the unseen test set, we achieved a word accuracy of 69.3% and a phone accuracy of 91.7% using a set of 49 distinct phonemes. Although we have no formal results on sound-to-letter generation, we believe that this formalism will be applicable for entering unknown words orally into a recognition system.
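The two evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not say how phoneme accuracy was scored, so the definition used here (one minus the total Levenshtein edit distance divided by the number of reference phonemes) is an assumption, as are the function names `edit_distance` and `score` and the toy phoneme strings.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # reference phoneme deleted
                            curr[j - 1] + 1,      # extra phoneme inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def score(refs: List[List[str]], hyps: List[List[str]]) -> Tuple[float, float]:
    """Return (whole-word accuracy, phoneme accuracy).

    Assumed definitions, not taken from the paper:
    a word is correct only if every phoneme matches, and phoneme
    accuracy is 1 - (total edit errors / total reference phonemes).
    """
    words_correct = sum(1 for r, h in zip(refs, hyps) if r == h)
    total_phonemes = sum(len(r) for r in refs)
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return words_correct / len(refs), 1.0 - errors / total_phonemes


# Hypothetical reference and predicted pronunciations for two words.
refs = [["k", "ae", "t"], ["d", "ao", "g"]]
hyps = [["k", "ae", "t"], ["d", "aa", "g"]]
word_acc, phoneme_acc = score(refs, hyps)
print(f"word accuracy: {word_acc:.3f}, phoneme accuracy: {phoneme_acc:.3f}")
```

A single wrong phoneme makes the whole word count as an error, which is why whole-word accuracy is typically much lower than phoneme accuracy on the same test set.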


Bibliographic reference.  Hunnicutt, Sheri / Meng, Helen / Seneff, Stephanie / Zue, Victor W. (1993): "Reversible letter-to-sound sound-to-letter generation based on parsing word morphology", In EUROSPEECH'93, 763-766.