Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Automatic Learning of Lexical Representations for Sub-Word Unit Based Speech Recognition Systems

Michael Phillips, James Glass, Victor W. Zue

Spoken Language Systems Group, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA

In 1989, we first reported on the development of SUMMIT, a segment-based, speaker-independent, continuous-speech recognition system [12]. The initial version of SUMMIT used a small set of context-independent models for the lexical labels. In this paper, we describe our recent attempts to develop a framework that can produce an arbitrarily complex lexical representation. The procedure should allow us to determine simultaneously both a set of context-dependent labels and a lexical network representing alternate pronunciations of the words in our lexicon. Our experiments thus far have been conducted independently on two recognition tasks. In both cases, the recognition error rate was significantly reduced.

