EUROSPEECH 2003 - INTERSPEECH 2003
This paper addresses compound splitting for Dutch in the context of broadcast news transcription. Language models were built from the original texts and from versions of those texts decomposed with a data-driven compound splitting algorithm. The models were compared in terms of out-of-vocabulary rates and word error rates in a real-world broadcast news transcription task. Compound splitting was found to improve ASR performance, with the best results obtained when frequent compounds were left intact.
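To make the comparison concrete, the following is a minimal Python sketch of how an out-of-vocabulary rate might be measured before and after compound splitting. It is not the algorithm used in the paper, which the abstract does not describe: the two-part lexicon lookup, the function names (`split_compound`, `decompose`, `oov_rate`), the `keep_threshold` parameter, and the toy word lists are all assumptions made for illustration. Linking morphemes in Dutch compounds are ignored.

```python
from collections import Counter


def split_compound(word, lexicon, min_part=3):
    """Split word into two lexicon entries if possible (hypothetical rule).

    Returns [head, tail] for the first split point where both parts are in
    the lexicon, otherwise [word] unchanged.
    """
    for i in range(min_part, len(word) - min_part + 1):
        head, tail = word[:i], word[i:]
        if head in lexicon and tail in lexicon:
            return [head, tail]
    return [word]


def decompose(tokens, lexicon, counts, keep_threshold):
    """Decompose tokens, leaving words seen at least keep_threshold times intact."""
    out = []
    for tok in tokens:
        if counts[tok] >= keep_threshold:
            out.append(tok)  # frequent word: not decomposed
        else:
            out.extend(split_compound(tok, lexicon))
    return out


def oov_rate(test_tokens, vocabulary):
    """Fraction of test tokens that are missing from the vocabulary."""
    if not test_tokens:
        return 0.0
    missing = sum(1 for tok in test_tokens if tok not in vocabulary)
    return missing / len(test_tokens)


# Toy data (assumed, not from the paper).
train = ["voetbal", "voetbal", "voetbal", "wedstrijd", "bond", "voet", "bal"]
test = ["voetbal", "voetbalwedstrijd", "voetbalbond", "wedstrijd"]

counts = Counter(train)
lexicon = set(counts)

oov_original = oov_rate(test, lexicon)
test_split = decompose(test, lexicon, counts, keep_threshold=3)
oov_split = oov_rate(test_split, lexicon)

print(oov_original, oov_split)
```

In this toy setting the unseen compounds `voetbalwedstrijd` and `voetbalbond` are covered by their parts after splitting, while `voetbal` stays whole because it meets the frequency threshold. Raising `keep_threshold` would cause it to be split into `voet` and `bal` as well.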
Bibliographic reference. Ordelman, Roeland / Hessen, Arjan van / Jong, Franciska de (2003): "Compound decomposition in Dutch large vocabulary speech recognition", In EUROSPEECH-2003, 225-228.