5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Frequency Analysis of Phonetic Units for Concatenative Synthesis in Catalan

Ignasi Esquerra, Albert Febrer, Climent Nadeu

Universitat Politècnica de Catalunya, Spain

Knowledge of phonetic unit frequencies is essential for developing databases for both concatenative synthesis and continuous speech recognition. In the present work, a large text corpus was processed and phonetically transcribed to obtain allophone and diphone frequencies for Catalan. The corpus was drawn from newspaper articles, which contained many foreign words that posed a problem for the normalisation process. After automatic transcription, the units were counted to obtain their relative frequencies, and the results were compared with those of other analyses. Finally, the diphones found in the corpus were compared with the units of a synthesis database to validate both the normalisation and transcription modules and the synthesis unit database itself.
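
The counting and comparison steps described above can be illustrated with a minimal sketch. It assumes the transcription module has already produced one list of allophone symbols per sentence. The function names, the phone symbols, and the toy inventory are hypothetical and are not taken from the system described here. Diphones are approximated as adjacent allophone pairs within a sentence, and transitions to and from silence are ignored.

```python
from collections import Counter


def unit_counts(utterances):
    """Count allophones and diphones in a transcribed corpus.

    utterances: iterable of lists of allophone symbols, one list per sentence
    (assumed output format of a transcription module).
    """
    allophones = Counter()
    diphones = Counter()
    for phones in utterances:
        allophones.update(phones)
        # A diphone is approximated here as a pair of adjacent allophones.
        diphones.update(zip(phones, phones[1:]))
    return allophones, diphones


def relative_frequencies(counts):
    """Convert raw counts to relative frequencies that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {unit: n / total for unit, n in counts.items()}


def missing_from_inventory(corpus_diphones, inventory):
    """Return corpus diphones that the synthesis inventory does not cover."""
    return sorted(set(corpus_diphones) - set(inventory))


# Toy data with hypothetical symbols, for illustration only.
utterances = [["k", "a", "z", "ə"], ["k", "a"]]
allophones, diphones = unit_counts(utterances)
diphone_freq = relative_frequencies(diphones)

inventory = {("k", "a"), ("a", "z")}
missing = missing_from_inventory(diphones, inventory)
```

A diphone that appears in the corpus but not in the inventory may point either to a gap in the synthesis database or to an error in normalisation or transcription, which is why the comparison can serve to check both.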

