8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Sentence Boundary Detection in Arabic Speech

Amit Srivastava, Francis Kubala

BBN Technologies, USA

This paper presents an automatic system for detecting sentence boundaries in speech recognition transcripts. We developed two component systems that use independent sources of information: a linguistic system that uses linguistic features in a statistical language model, and an acoustic system that uses prosodic features in a feed-forward neural network model. A third system combines the scores from the acoustic and linguistic systems in a Maximum-Likelihood framework. All of the systems outlined in this paper are essentially language-independent, but all of our experiments were conducted on Arabic Broadcast News speech recognition transcripts. The experiments show that while the acoustic system outperforms the linguistic system, the combined system achieves the best performance at detecting sentence boundaries.
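
The abstract does not state how the two scores are combined, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It assumes each system emits, for every candidate word boundary, a probability that a sentence ends there, and it merges the two with a weighted log-linear interpolation as a stand-in for the Maximum-Likelihood combination. The names `combine_scores` and `detect_boundaries`, the `weight` and `threshold` parameters, and the input values are all hypothetical.

```python
import math


def combine_scores(p_acoustic, p_linguistic, weight=0.5, eps=1e-12):
    """Merge two boundary probabilities by log-linear interpolation.

    Assumption: `weight` is the share given to the acoustic score; the
    abstract does not specify any such parameter.
    """
    def clamp(p):
        # Keep probabilities away from 0 and 1 so the logarithms are finite.
        return min(max(p, eps), 1.0 - eps)

    pa, pl = clamp(p_acoustic), clamp(p_linguistic)

    # Weighted log scores for the "boundary" and "no boundary" hypotheses.
    log_yes = weight * math.log(pa) + (1.0 - weight) * math.log(pl)
    log_no = weight * math.log(1.0 - pa) + (1.0 - weight) * math.log(1.0 - pl)

    # Normalise back to a probability, subtracting the max for stability.
    m = max(log_yes, log_no)
    yes = math.exp(log_yes - m)
    no = math.exp(log_no - m)
    return yes / (yes + no)


def detect_boundaries(acoustic, linguistic, weight=0.5, threshold=0.5):
    """Return indices of word boundaries whose combined score exceeds threshold."""
    return [
        i
        for i, (pa, pl) in enumerate(zip(acoustic, linguistic))
        if combine_scores(pa, pl, weight) > threshold
    ]


if __name__ == "__main__":
    # Made-up scores for three candidate boundaries, for illustration only.
    acoustic = [0.9, 0.2, 0.6]
    linguistic = [0.8, 0.1, 0.3]
    print(detect_boundaries(acoustic, linguistic))
```

In a scheme like this, the interpolation weight and the decision threshold would normally be tuned on held-out data; the defaults here are placeholders.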

Bibliographic reference. Srivastava, Amit / Kubala, Francis (2003): "Sentence boundary detection in Arabic speech", in EUROSPEECH-2003, 949-952.