This article addresses the problem of controlling the linguistic content of a speech corpus. Depending on the target task (speech recognition, speech synthesis, etc.), we seek to control the phonological and linguistic content of the corpus by collecting an optimal set of sentences that covers a preset description of phonological attributes (prosodic tags, allophones, syllables, etc.) under the constraint of a minimal overall duration. This goal is classically pursued with greedy algorithms, which, however, do not guarantee the optimality of the resulting cover. We propose to use the principle of Lagrangian relaxation, in which a set covering problem is solved by iterating between the primal and dual spaces. We evaluate the proposed methodology against a standard greedy algorithm on the task of estimating an optimal phone and diphone covering in French. Our results show that our algorithm based on Lagrangian relaxation gives a solution 10% better than that of a standard greedy algorithm and, more importantly, makes it possible to assess the absolute quality of the proposed solution by providing a lower bound for the set covering problem. According to our experiments, our best solution is within 0.8% of the lower bound of the phone and diphone covering problem.
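To make the contrast between the two approaches concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each sentence is a set of units (for example diphones) with a duration as its cost, and that every unit occurs in at least one sentence. The function names, the toy data, and the subgradient step rule (a common textbook choice) are all illustrative assumptions. The greedy routine returns a feasible cover, while the Lagrangian routine returns a lower bound on the cost of any cover, which is what allows the quality of a solution to be assessed.

```python
def greedy_cover(sentences, costs, universe):
    """Greedy heuristic: repeatedly take the sentence with the lowest
    cost per newly covered unit. Assumes every unit is coverable."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = min(
            (j for j in range(len(sentences)) if sentences[j] & uncovered),
            key=lambda j: costs[j] / len(sentences[j] & uncovered),
        )
        chosen.append(best)
        uncovered -= sentences[best]
    return chosen


def lagrangian_lower_bound(sentences, costs, universe, upper_bound,
                           iterations=200):
    """Subgradient ascent on the Lagrangian dual of set covering.
    Any multiplier vector u >= 0 yields a valid lower bound, so the
    best value seen is a lower bound on the optimal cover cost."""
    u = {i: 0.0 for i in universe}   # one multiplier per unit to cover
    best = 0.0                       # trivial bound: costs are positive
    scale = 2.0                      # assumed step-size schedule
    for _ in range(iterations):
        # Reduced cost of each sentence under the current multipliers.
        reduced = [costs[j] - sum(u[i] for i in s)
                   for j, s in enumerate(sentences)]
        # The relaxed problem picks every sentence with negative reduced cost.
        picked = [j for j, r in enumerate(reduced) if r < 0]
        bound = sum(u.values()) + sum(reduced[j] for j in picked)
        best = max(best, bound)
        # Subgradient: 1 minus the number of picked sentences covering unit i.
        g = {i: 1 - sum(1 for j in picked if i in sentences[j])
             for i in universe}
        norm = sum(v * v for v in g.values())
        if norm == 0:
            break
        step = scale * (upper_bound - bound) / norm
        for i in universe:
            u[i] = max(0.0, u[i] + step * g[i])
        scale *= 0.95
    return best


# Toy data (invented for illustration): four sentences over four units.
sentences = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"a", "b", "c", "d"}]
costs = [2.0, 2.0, 2.0, 5.0]
universe = {"a", "b", "c", "d"}

chosen = greedy_cover(sentences, costs, universe)
upper = sum(costs[j] for j in chosen)
lower = lagrangian_lower_bound(sentences, costs, universe, upper)
print(f"greedy cost = {upper}, Lagrangian lower bound = {lower:.3f}")
```

The gap between the greedy cost and the lower bound limits how far the greedy solution can be from the optimum, which is the kind of guarantee a greedy algorithm alone cannot provide.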
Cite as: Chevelu, J., Barbot, N., Boeffard, O., Delhay, A. (2007) Lagrangian relaxation for optimal corpus design. Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6), 211-216
@inproceedings{chevelu07_ssw,
  author={Jonathan Chevelu and Nelly Barbot and Olivier Boeffard and Arnaud Delhay},
  title={{Lagrangian relaxation for optimal corpus design}},
  year=2007,
  booktitle={Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6)},
  pages={211--216}
}