5th International Conference on Spoken Language Processing
Many telephony applications of automatic speech recognition involve recognizing input that contains numbers of some kind. Traditionally, this has been achieved with isolated or connected digit recognizers. However, as speech recognition finds a wider range of applications, it is often infeasible to impose restrictions on speaker behavior. This paper studies two model topologies for natural number recognition, both of which use inter-word context-dependent acoustic models trained with the minimum classification error (MCE) criterion. One topology uses triphone context units, while the other is of the head-body-tail (HBT) type. The performance of the models is evaluated on three natural number applications involving recognition of dates, time of day, and dollar amounts. Experimental results show that context-dependent models reduce string error rates by as much as 50% relative to baseline context-independent whole-word models. String accuracies of about 93% are obtained on these tasks while still allowing users flexibility in speaking style.
Bibliographic reference. Gandhi, Malan B. (1998): "Natural number recognition using discriminatively trained inter-word context dependent hidden Markov models", In ICSLP-1998, paper 0090.