Sixth European Conference on Speech Communication and Technology
The lack of freely available state-of-the-art Speech-to-Text (STT) software has been a major hindrance to the development of new audio information processing technology. The high cost of the infrastructure required to conduct state-of-the-art speech recognition research prevents many small research groups from evaluating new ideas on large-scale tasks. In this paper, we present the core components of a freely available state-of-the-art STT system: an acoustic processor, which converts the speech signal into a sequence of feature vectors; a training module, which estimates the parameters of a hidden Markov model (HMM); a linguistic processor, which predicts the next word given a sequence of previously recognized words; and a search engine, which finds the most probable word sequence given a sequence of feature vectors.
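As an illustration of the search step only, the following is a minimal sketch of Viterbi decoding over a toy discrete HMM. It is not the system described in the paper: the states, symbols, and probabilities are invented for the example, and the function name `viterbi` and its parameters are assumptions. A real STT search engine would operate over word sequences and combine acoustic and language model scores.

```python
import math

def viterbi(observations, states, log_init, log_trans, log_emit):
    """Return (best_log_prob, best_path) for a discrete HMM (toy sketch)."""
    # best[s] holds (log probability of the best path ending in s, that path)
    best = {s: (log_init[s] + log_emit[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Pick the predecessor that maximizes the path score into s
            prev = max(states, key=lambda p: best[p][0] + log_trans[p][s])
            score = best[prev][0] + log_trans[prev][s] + log_emit[s][obs]
            new_best[s] = (score, best[prev][1] + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])

def to_log(table):
    """Convert a nested dict of probabilities to log probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}

# Hypothetical two-state model over two symbols (values invented for the demo)
states = ["a", "b"]
log_init = to_log({"a": 0.6, "b": 0.4})
log_trans = to_log({"a": {"a": 0.7, "b": 0.3}, "b": {"a": 0.4, "b": 0.6}})
log_emit = to_log({"a": {"x": 0.9, "y": 0.1}, "b": {"x": 0.2, "y": 0.8}})

score, path = viterbi(["x", "x", "y"], states, log_init, log_trans, log_emit)
print(path, math.exp(score))
```

Working in log probabilities avoids numerical underflow on long observation sequences, which is why the sketch adds scores instead of multiplying probabilities.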
Bibliographic reference. Ordowski, M. / Deshmukh, N. / Ganapathiraju, A. / Hamaker, J. / Picone, J. (1999): "A public domain speech-to-text system", in EUROSPEECH'99, 2127-2130.