8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Towards Automatic Word Segmentation of Dialect Speech

Eric Sanders (1), Andrea Diersen (1), Willy Jongenburger (2), Helmer Strik (1)

(1) Radboud University Nijmegen, Netherlands
(2) Meertens Instituut, Netherlands

This paper describes the creation of a digital dialect database, with a focus on automatic word segmentation. Automatic word segmentation has been studied by several research groups over the last two decades. However, the task we face differs from previous ones in several respects. For instance, we are dealing with recordings of interviews containing spontaneous dialect speech, accompanied by 'enriched' (quasi-phonetic) orthographic transcriptions instead of the 'normal' orthographic transcriptions that are usually available. Furthermore, the nature of the task requires that the word segmentation procedure can be adapted for each interview.

Bibliographic reference. Sanders, Eric / Diersen, Andrea / Jongenburger, Willy / Strik, Helmer (2004): "Towards automatic word segmentation of dialect speech", in INTERSPEECH-2004, 2745-2748.