First European Conference on Speech Communication and Technology

Paris, France
September 27-29, 1989

Disambiguation of the E-Set for Connected-Alphadigit Recognition

Alain J. Vigier (1), Harvey F. Silverman (2)

(1) Thomson CSF 66, Gennevilliers, France
(2) Brown University, Division of Engineering, LEMS, Providence, RI, USA

A real-time, talker-dependent, connected-speech recognizer has been operational at the Laboratory for Engineering Man-Machine Systems (LEMS) for three years. This recognizer analyses strings of connected digits or alphadigits, using dynamic programming (DP) techniques and an expert for the final decision. DP often misclassifies words within difficult subgroups of the vocabulary, such as the E-set (letters e, p, t, b, d, g, v, z, c). In this paper, we present a feedback mechanism for disambiguating words that the first pass of the recognizer classifies as belonging to the E-set. This mechanism reanalyses the speech data within an appropriate context and uses the analysis as input to a hidden Markov model (HMM) that incorporates acoustic, phonetic and linguistic knowledge about the elements of the E-set. Experiments were run using speech from 5 male talkers. A different model was computed for each talker. Each model was trained with 12 replications of each word and tested with 72 utterances. A recognition rate of 95% was achieved.
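
The two-pass idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the model topology or the features, so the sketch assumes one small discrete-observation HMM per E-set letter, vector-quantized frames as integer symbols, and scoring with the standard forward algorithm in the log domain. The names make_model, forward_log_likelihood and disambiguate are hypothetical.

```python
import math

# Letters of the E-set as listed in the abstract.
E_SET = ["e", "p", "t", "b", "d", "g", "v", "z", "c"]


def _log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return -math.inf if p == 0 else math.log(p)


def make_model(init, trans, emit):
    """Convert probability tables to log tables (hypothetical helper)."""
    return (
        [_log(p) for p in init],
        [[_log(p) for p in row] for row in trans],
        [[_log(p) for p in row] for row in emit],
    )


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, log_init, log_trans, log_emit):
    """Forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(log_init)
    alpha = [log_init[s] + log_emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + log_trans[p][s] for p in range(n)])
            + log_emit[s][o]
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def disambiguate(first_pass_label, obs, models):
    """Second pass: rescore only words the first pass placed in the E-set.

    models maps each candidate letter to its (assumed) per-letter HMM.
    Labels outside the E-set are returned unchanged.
    """
    if first_pass_label not in E_SET:
        return first_pass_label
    return max(models, key=lambda w: forward_log_likelihood(obs, *models[w]))
```

In this sketch the second pass is triggered only when the first-pass label falls inside the E-set, which mirrors the feedback structure described in the abstract; everything about the models themselves is an illustrative assumption.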

Bibliographic reference. Vigier, Alain J. / Silverman, Harvey F. (1989): "Disambiguation of the e-set for connected-alphadigit recognition", in EUROSPEECH-1989, 1021-1024.