ISCA Archive PMLA 2002

Modelling phonological rules through linguistic hierarchies

Stephanie Seneff, Chao Wang

This paper describes our research aimed at acquiring a generalized probability model for alternative phonetic realizations in conversational speech. The approach begins by applying a set of ordered context-dependent phonological rules to the baseforms in the recognizer’s lexicon. The probability model is acquired by observing specific realizations in a large training corpus. A set of context-free rules represents words in terms of a substructure, which allows context-dependent probabilities to generalize to other words that share the same sub-word context. The model is designed to capture phonetic predictions based on local phonemic, morphologic, and syllabic contexts, thus permitting training on corpora whose lexicon differs from that of the intended application. The training corpus consisted of a large set of Jupiter weather-domain speech data [9] augmented with a much smaller set of Mercury flight-domain data [20]. The baseline system used the same set of phonological rules for lexical expansion, but with no probability modelling for alternate pronunciations. We evaluated on a test set of utterances drawn exclusively from the flight domain. Using this approach, we achieved a 12.6% reduction in speech understanding error rate on the test set.
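The general idea of learning context-dependent realization probabilities from observed data can be illustrated with a small sketch. This is not the authors' model: it is a plain relative-frequency estimate with a single backoff step, and the context labels (`intervocalic_unstressed`, `syllable_onset`, `coda`), the function names `train` and `prob`, and the toy observations are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict

def train(observations):
    """Count realizations from (phoneme, context, realization) triples.

    The triples are assumed to come from aligning a corpus against
    rule-expanded baseforms; here they are supplied by hand.
    """
    ctx_counts = defaultdict(Counter)    # (phoneme, context) -> realization counts
    phone_counts = defaultdict(Counter)  # phoneme -> realization counts (backoff)
    for phoneme, context, realization in observations:
        ctx_counts[(phoneme, context)][realization] += 1
        phone_counts[phoneme][realization] += 1
    return ctx_counts, phone_counts

def prob(model, phoneme, context, realization):
    """Relative-frequency estimate of P(realization | phoneme, context).

    Backs off to the context-independent counts when the context is unseen.
    """
    ctx_counts, phone_counts = model
    counts = ctx_counts.get((phoneme, context))
    if not counts:
        counts = phone_counts.get(phoneme)
    if not counts:
        return 0.0
    return counts[realization] / sum(counts.values())

# Toy observations (invented): /t/ is often flapped between vowels
# before an unstressed syllable, but not in a syllable onset.
toy = (
    [("t", "intervocalic_unstressed", "flap")] * 3
    + [("t", "intervocalic_unstressed", "t")]
    + [("t", "syllable_onset", "t")] * 3
)
model = train(toy)
p_flap = prob(model, "t", "intervocalic_unstressed", "flap")  # 3 of 4 observations
p_backoff = prob(model, "t", "coda", "flap")                  # unseen context: 3 of 7
```

Because the estimates are keyed on sub-word context rather than on whole words, counts gathered from one vocabulary can be reused for words that never appeared in training, which is the property the abstract highlights.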

Cite as: Seneff, S., Wang, C. (2002) Modelling phonological rules through linguistic hierarchies. Proc. ITRW on Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology (PMLA 2002), 71-76

  author={Stephanie Seneff and Chao Wang},
  title={{Modelling phonological rules through linguistic hierarchies}},
  booktitle={Proc. ITRW on Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology (PMLA 2002)},