Sixth European Conference on Speech Communication and Technology
Speech from spontaneous conversational phone-call databases is difficult to recognize because of large variations in speech rate and pronunciation, noise, acoustic degradation from the telephone channel, and an unpredictable, non-grammatical language structure that includes many random phenomena. Each cause of misrecognition can be addressed separately; however, there is still no satisfactory solution. Because the system treats a misrecognition as a kind of new word, we propose to apply our keyword spotting and new-word detection technology here. However, because of the wide variety of misrecognition types and the lack of information on where, why, and how they occur, we had to define a language model different from those used in previous work. Results show a noticeable recognition improvement, often associated with a decrease in the number of substitutions and a slight increase in the number of deletions.
Bibliographic reference. El Méliani, Rachida / O’Shaughnessy, Douglas (1999): "Error spotting using syllabic fillers in spontaneous conversational speech recognition", In EUROSPEECH'99, 279-282.