8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Independent Automatic Segmentation by Self-Learning Categorial Pronunciation Rules

N. Beringer

Ludwig-Maximilians-Universität München, Germany

This paper presents a new method for automatically generating pronunciation rules for the automatic segmentation of speech: the German MAUSER system. MAUSER is an algorithm that generates pronunciation rules independently of any domain-dependent training data, either by clustering and statistically weighting self-learned rules according to a small set of phonological rules clustered by categories, or by re-weighting "seen" phonological rules. With this method, large corpora of mainly unprompted speech can be segmented automatically and cost-effectively.

Bibliographic reference. Beringer, N. (2003): "Independent automatic segmentation by self-learning categorial pronunciation rules", in EUROSPEECH-2003, 785-788.