HTK and Sphinx are two freely downloadable software packages capable of implementing a large-vocabulary, speaker-independent, continuous speech recognition system in any language. HTK has been used by various groups for about a decade and has gone through the refinement cycles necessary for commercial software, whereas Sphinx was released about a year ago and is still under development in a university environment. However, owing to certain advanced features and a license that permits unrestricted use, Sphinx appears to be more attractive. We compared the two packages by implementing a Hindi speech recognition system with each. Although the recognition accuracies of the two systems are comparable, we observe that the acoustic modeling of Sphinx is superior.
Cite as: Samudravijaya, K., Barot, M. (2003) A Comparison of Public-Domain Software Tools for Speech Recognition. Proc. Workshop on Spoken Language Processing, 125-131
@inproceedings{samudravijaya03_wslp, author={K. Samudravijaya and Maria Barot}, title={{A Comparison of Public-Domain Software Tools for Speech Recognition}}, year=2003, booktitle={Proc. Workshop on Spoken Language Processing}, pages={125--131} }