Sixth European Conference on Speech Communication and Technology
Relations between non-adjacent parts of an utterance are commonly regarded as an important source of information for speech recognition, but they have seen little use in speech recognition systems. In this paper, we include this information through joint distributions of pairs of phones occurring in the same utterance. In addition to relations between acoustic events, we have also incorporated relations between spectral and prosodically oriented information, such as phone duration, position in the utterance and fundamental frequency. Preliminary N-best rescoring results show a 10% word error reduction compared to a baseline Viterbi decoder.
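The abstract does not give the model details, so the following is only a minimal sketch of how N-best rescoring with a pairwise, within-utterance term could look. It assumes each hypothesis carries one normalized scalar feature per phone (for example a duration deviation) and that every phone pair shares a single correlation `rho` under a bivariate Gaussian; the names `bivariate_log_gain`, `rescore`, `rho` and `weight` are hypothetical and not taken from the paper.

```python
import math
from itertools import combinations


def bivariate_log_gain(x, y, rho):
    """log p(x, y) - log p(x) - log p(y) for standard-normal marginals
    with correlation rho (assumed model, not the paper's)."""
    one_minus = 1.0 - rho * rho
    quad = rho * rho * (x * x + y * y) - 2.0 * rho * x * y
    return -0.5 * math.log(one_minus) - quad / (2.0 * one_minus)


def rescore(nbest, rho, weight=1.0):
    """Re-rank an N-best list.

    nbest: list of (label, baseline_log_score, per_phone_features).
    Returns (label, combined_score) pairs sorted best-first.
    """
    rescored = []
    for label, baseline, feats in nbest:
        # Sum the correlation term over all phone pairs, adjacent or not.
        pair_term = sum(
            bivariate_log_gain(a, b, rho) for a, b in combinations(feats, 2)
        )
        rescored.append((label, baseline + weight * pair_term))
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Toy N-best list: B has the better baseline score, but its two phones
# deviate in opposite directions, which a positive rho penalizes.
nbest = [("A", -10.0, [1.0, 1.0]), ("B", -9.9, [1.0, -1.0])]
ranking = rescore(nbest, rho=0.5)
```

With `weight=0.0` the ranking reduces to the baseline order, which makes the contribution of the pairwise term easy to isolate.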
Bibliographic reference. Blomberg, Mats (1999): "Within-utterance correlation for speech recognition", In EUROSPEECH'99, 2479-2482.