Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Context-Sensitive Phoneme Lattice Generation Using Interpolated Demi-Diphone And Triphone Models

Fergus R. McInnes

Centre for Speech Technology Research, University of Edinburgh, UK

Automatic speech recognition systems based on subword units (such as phonemes) can be enhanced by the use of context-specific modelling. This has been applied successfully in top-down recognition systems, in which strong lexical and syntactic constraints limit the number of context-specific units to be modelled. This paper describes a method for applying context-specific modelling in a modular system in which the acoustic-phonetic front end operates independently of vocabulary and syntax. Such a modular system has certain advantages as a research tool, particularly when combined with an entropy measure for evaluation of phoneme lattices. A technique for robust modelling of context-specific units, by interpolation of general and specific probability estimates, is also described. Comparative results are presented which show the improvements due to the context-specific modelling.

Keywords: continuous speech recognition, hidden Markov models, triphone modelling, entropy estimation
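
The interpolation of general and specific probability estimates mentioned in the abstract can be illustrated by a minimal sketch. The abstract does not give the weighting scheme, so the count-based weight below, the smoothing constant `k`, and the function name `interpolate_estimates` are all assumptions made for illustration; they should not be read as the method of the paper.

```python
def interpolate_estimates(p_specific, p_general, count, k=10.0):
    """Blend a context-specific distribution with a general one.

    Assumed weighting (not taken from the paper): the specific estimate
    gets weight count / (count + k), so contexts with few training
    examples fall back towards the general estimate.
    """
    if count < 0 or k <= 0:
        raise ValueError("count must be non-negative and k must be positive")
    if len(p_specific) != len(p_general):
        raise ValueError("distributions must have the same length")
    weight = count / (count + k)
    return [weight * s + (1.0 - weight) * g
            for s, g in zip(p_specific, p_general)]


# A context seen 10 times, with k = 10, weights both estimates equally.
blended = interpolate_estimates([1.0, 0.0], [0.5, 0.5], count=10)
```

Because the result is a convex combination of two distributions, it still sums to one, and an unseen context (`count=0`) reduces to the general estimate.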

