The Multi-Level Alignment System (MuLAS) is the L2F tool for building multi-tier speech corpora with little or no human intervention. MuLAS automatically combines information from external speech annotations, whether human- or machine-generated, with the text-based utterance descriptions that it creates, in order to build more reliable and complete descriptions of the spoken utterances.
This paper presents our methods for multi-tier annotation synchronization, which underlie the operation of MuLAS. These methods have allowed us to extend the building of multi-tier corpora to new languages with little additional effort. MuLAS has been successfully applied to building multi-tier corpora for speech synthesis in American and British English, European Portuguese, and German. Natural prosody generation has also benefited from MuLAS, since prosodic models can be derived from the corpora it builds.
Bibliographic reference. Paulo, Sérgio / Oliveira, Luís C. (2007): "MuLAS: a framework for automatically building multi-tier corpora", In INTERSPEECH-2007, 1525-1528.