ISCA Workshop on Multilingual Speech and Language Processing (MULTILING 2006), Center for Language and Speech Technology, Stellenbosch University, Stellenbosch, South Africa
This paper presents a low-footprint multilingual text-to-speech (ML-TTS) framework. The system is part of a speaker-independent name dialing system that has been introduced in Nokia Series 60 mobile phones. ML-TTS systems based on the Klatt88 engine usually contain sets of language-specific rules that modify the speech synthesis parameters. When these rules are implemented as program code, the code size grows as the number of languages increases. In addition, adding TTS support for a new language is difficult, because it requires modifying the source code, which is error-prone and time-consuming. The paper presents a novel scheme that alleviates the memory problem and makes language development easier than in typical existing solutions. In this framework the language-dependent TTS rules are written in a scripting language and stored in text files, one file per language. The files are converted into binary form, so the rules are implemented as data rather than code. With this approach, only the data of the active language needs to be kept in memory, and the size of a single data file typically remains small. During synthesis an interpreter processes the rules and modifies the synthesis parameters accordingly. Adding TTS support for a new language involves writing a new set of language-specific rules; ideally, no modifications to the TTS engine code are needed. In addition to the language-specific rules, all other language-dependent information, such as the prosodic model, is stored in the binary file, i.e. the language package. Because of the language packages, the TTS engine can be configured for any desired set of languages simply by preparing and providing the associated language packages.
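The rules-as-data idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scripting language or the binary layout, so everything here is an assumption made for illustration: the rule syntax (`phoneme op parameter value`), the opcodes `SET` and `SCALE`, the parameter names, and the functions `compile_rules` and `apply_rules` are all hypothetical and are not the authors' format.

```python
import struct

# Hypothetical opcodes and synthesis parameter names (assumptions for this
# sketch; the actual rule language is not described in the abstract).
OPS = {"SET": 0, "SCALE": 1}
PARAMS = ["f0", "duration_ms", "amplitude"]
RECORD = "<BBf"  # opcode, parameter index, 32-bit float value


def compile_rules(text):
    """Convert a text rule script into a compact binary blob."""
    blob = bytearray()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        phoneme, op, param, value = line.split()
        code = phoneme.encode("ascii")
        # Record layout: phoneme length, phoneme bytes, then the fixed part.
        blob += struct.pack("<B", len(code)) + code
        blob += struct.pack(RECORD, OPS[op], PARAMS.index(param), float(value))
    return bytes(blob)


def apply_rules(blob, phoneme, params):
    """Interpret the binary rules and return modified synthesis parameters."""
    out = dict(params)
    pos = 0
    while pos < len(blob):
        n = blob[pos]
        pos += 1
        target = blob[pos:pos + n].decode("ascii")
        pos += n
        op, idx, value = struct.unpack_from(RECORD, blob, pos)
        pos += struct.calcsize(RECORD)
        if target != phoneme:
            continue  # rule belongs to a different phoneme
        name = PARAMS[idx]
        if op == OPS["SET"]:
            out[name] = value
        elif op == OPS["SCALE"]:
            out[name] = out[name] * value
    return out


# Hypothetical rule script for one language.
EXAMPLE_RULES = """
# phoneme op parameter value
a SCALE duration_ms 1.5
a SET f0 120
"""

package = compile_rules(EXAMPLE_RULES)
base = {"f0": 100.0, "duration_ms": 80.0, "amplitude": 1.0}
print(len(package), apply_rules(package, "a", base))
```

In this sketch, supporting another language would mean writing another rule script and compiling it; `apply_rules` stays unchanged, which mirrors the claim that ideally no engine code needs to be modified.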
Bibliographic reference. Pärssinen, Kimmo / Moberg, Marko (2006): "Multilingual data configurable text-to-speech system for embedded devices", In MULTILING-2006, paper 016.