2nd Workshop on Spoken Language Technologies for Under-Resourced Languages

Universiti Sains Malaysia, Penang, Malaysia
May 3-5, 2010

Modeling of Geminate Duration in an Amharic Text-to-Speech Synthesis System

Tadesse Anberbir (1), Tomio Takara (2), Dong Yoon Kim (1)

(1) Department of Computer Engineering, Ajou University, Suwon, Korea
(2) Department of Information Engineering, University of the Ryukyus, Okinawa, Japan

This paper presents an analysis and model of geminate duration in an Amharic Text-to-Speech (AmhTTS) synthesis system. AmhTTS is a parametric, rule-based system that employs a cepstral method. The system uses a source-filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The fundamental speech units of the system are syllables. Gemination is one of the distinctive features of Amharic and plays a crucial role in the naturalness of synthesized speech. Therefore, in this study we focus on geminates and model their duration in the AmhTTS system. The effectiveness of the duration model was evaluated using 200 words (40% of which contain one or more geminated syllables and 75% of which contain sixth-order syllables) and 5 sentences (containing one or more words with geminated syllables), and the results were promising. The listening test results showed that accurate estimation of geminate duration is crucial for the intelligibility and naturalness of the AmhTTS system. Our model greatly improved the intelligibility and naturalness of the system.

Index Terms: Amharic, geminates, speech synthesis, duration, cepstrum.


Bibliographic reference. Anberbir, Tadesse / Takara, Tomio / Kim, Dong Yoon (2010): "Modeling of geminate duration in an Amharic text-to-speech synthesis system", in SLTU-2010, 122-129.