ISCA Archive Eurospeech 1991

English alphabet recognition with telephone speech

Ronald Cole, Krist Roginski, Mark Fanty

We describe database and system development for speaker-independent recognition of telephone speech. The telephone speech database contains about 4,000 callers from the USA and Canada, each of whom provided several utterances, including city names, first and last names, spelled names, and answers to yes/no questions. About 1,000 of the callers recited the English alphabet with pauses between letters. A portion of the database has been verified and phonetically labeled, and this portion was used to develop a baseline system that recognizes names spelled with pauses between letters. The system uses a neural network to segment speech into a sequence of 24 phonetic categories. The phonetic categories are used to hypothesize a sequence of letters, which are then reclassified using a second neural network. First-choice letter recognition accuracy was 87.6% in the best condition. First-choice name retrieval was S5.5% for 200 spelled names retrieved from a database of 50,000 common last names.

doi: 10.21437/Eurospeech.1991-120

Cite as: Cole, R., Roginski, K., Fanty, M. (1991) English alphabet recognition with telephone speech. Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991), 479-482, doi: 10.21437/Eurospeech.1991-120

  author={Ronald Cole and Krist Roginski and Mark Fanty},
  title={{English alphabet recognition with telephone speech}},
  booktitle={Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991)},