Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Acquisition of an Extensive Rule Set for Slovene Grapheme-to-Allophone Transcription

Jerneja Gros, F. Mihelic

Artificial Perception Laboratory, Faculty of Electrical Engineering, University of Ljubljana, Slovenia

An extensive rule set for grapheme-to-allophone conversion of Slovene texts has been defined and evaluated. A second rule set has been developed for the pronunciation of names. The performance of both S5 rule sets was compared with that of the Onomastica rule set on two manually transcribed test data sets. A performance test on the S5 pronunciation dictionary showed error rates of about 30% in stress assignment and, consequently, in the phonetic transcription. When stress assignment and the transcriptions of graphemic /e/ and /o/ in stressed syllables were marked in advance, a transcription success rate of nearly 100% was achieved both on names and on standard words, with the S5 names rule set and the S5 standard words rule set, respectively.
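The error and success rates quoted above compare automatically generated transcriptions with manually transcribed references. A minimal sketch of such a word-level comparison is given below. It assumes exact string matching per word, which may differ from the scoring actually used in the evaluation; the function name and the toy transcription strings are hypothetical placeholders, not data from the paper.

```python
def transcription_error_rate(predicted, reference):
    """Return the fraction of words whose predicted transcription
    differs from the manually transcribed reference.

    Assumption: a word counts as an error if its transcription is not
    an exact string match with the reference transcription.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("reference set is empty")
    errors = sum(1 for p, r in zip(predicted, reference) if p != r)
    return errors / len(reference)


# Hypothetical placeholder transcriptions (not taken from the paper).
reference = ["a1", "b1", "c1", "d1"]
predicted = ["a1", "b2", "c1", "d1"]

rate = transcription_error_rate(predicted, reference)
print(f"error rate: {rate:.0%}, success rate: {1 - rate:.0%}")
```

The success rate is simply one minus the error rate, so a single function covers both figures reported for a given test set.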

