ISCA Archive ASIDE 2005

Effect of poor spontaneous speech modeling on broadcast news transcription performance

Laura Docio-Fernandez, Carmen Garcia-Mateo

Recognition of spontaneous speech is an important area in the field of automatic speech recognition. Although most recognition systems deliver high accuracy on planned speech, they still perform poorly on spontaneous speech. A news broadcast generally includes speech from different speakers with different speech styles (planned, spontaneous and conversational), which is why we chose a Broadcast News (BN) transcription application as our experimental framework. Specifically, we used Transcrigal, our Broadcast News transcription system, whose test set contains approximately 19% spontaneous speech. Our recognition results show that even though the BN system delivers high accuracy on planned speech, its performance on spontaneous speech is rather poor. In this paper, we analyze some of the most important features of spontaneous speech, and conclude that recognition performance can be improved both by using as much spontaneous speech training data as possible, and by explicitly modeling spontaneous events, such as filled pauses and breaths, in the decoder.


Cite as: Docio-Fernandez, L., Garcia-Mateo, C. (2005) Effect of poor spontaneous speech modeling on broadcast news transcription performance. Proc. Applied Spoken Language Interaction in Distributed Environments (ASIDE 2005), paper 19

@inproceedings{dociofernandez05_aside,
  author={Laura Docio-Fernandez and Carmen Garcia-Mateo},
  title={{Effect of poor spontaneous speech modeling on broadcast news transcription performance}},
  year=2005,
  booktitle={Proc. Applied Spoken Language Interaction in Distributed Environments (ASIDE 2005)},
  pages={paper 19}
}