ISCA Archive ICSLP 1994

Speech recognition without grammar or vocabulary constraints

Harold Singer, Jun-ichi Takami

Out-of-vocabulary words and ungrammatical utterances are two major problems in speech recognition. We believe that improving the acoustic model is essential in dealing with these problems. We propose to use a 'phonetic typewriter' as an evaluation method. Unlike common approaches, which evaluate the acoustic and language models together, this allows direct evaluation of the acoustic model. A comparison of context-independent phone models based on continuous mixture HMMs (20 mixtures per state) with context-dependent phone models based on HMnet [4] (3 mixtures per state) showed that the phoneme error rate can be halved by using the latter models. The same 'phonetic typewriter' paradigm can also be used directly as a speech recognition method, in which speech is recognized as a string of phonemes without constraints on vocabulary or grammar. We show that over 97% phoneme recognition accuracy can be achieved with our best acoustic model.
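The abstract reports phoneme error rate and phoneme recognition accuracy but does not say how they are scored. As an illustration only, the sketch below assumes the conventional definition, accuracy = (N - S - D - I) / N, where N is the number of reference phonemes and S, D, I are the substitutions, deletions and insertions found by a minimum edit-distance alignment. The function names and the example phoneme strings are hypothetical and are not taken from the paper.

```python
def phoneme_edit_counts(reference, hypothesis):
    """Return (substitutions, deletions, insertions) for a minimum edit-distance
    alignment of two phoneme sequences (assumed unit cost for every error)."""
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = (total_errors, subs, dels, ins) for reference[:i] vs hypothesis[:j]
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)  # every reference phoneme deleted
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)  # every hypothesis phoneme inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = cost[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diagonal = (t, s, d, k)          # correct phoneme
            else:
                diagonal = (t + 1, s + 1, d, k)  # substitution
            t, s, d, k = cost[i - 1][j]
            deletion = (t + 1, s, d + 1, k)
            t, s, d, k = cost[i][j - 1]
            insertion = (t + 1, s, d, k + 1)
            # Tuples compare on total errors first, so this keeps the cheapest path.
            cost[i][j] = min(diagonal, deletion, insertion)
    _, subs, dels, ins = cost[n][m]
    return subs, dels, ins


def phoneme_accuracy(reference, hypothesis):
    """Phoneme accuracy in percent under the assumed (N - S - D - I) / N definition."""
    if not reference:
        raise ValueError("reference phoneme sequence must not be empty")
    subs, dels, ins = phoneme_edit_counts(reference, hypothesis)
    n = len(reference)
    return 100.0 * (n - subs - dels - ins) / n


def phoneme_error_rate(reference, hypothesis):
    """Phoneme error rate in percent, the complement of phoneme_accuracy."""
    return 100.0 - phoneme_accuracy(reference, hypothesis)
```

Under this definition insertions count against accuracy, so a recognizer running without vocabulary or grammar constraints is penalized for spurious phonemes as well as for missed or confused ones.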


Cite as: Singer, H., Takami, J.-i. (1994) Speech recognition without grammar or vocabulary constraints. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 2207-2210

@inproceedings{singer94_icslp,
  author={Harold Singer and Jun-ichi Takami},
  title={{Speech recognition without grammar or vocabulary constraints}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={2207--2210}
}