16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Pruning Redundant Synthesis Units Based on Static and Delta Unit Appearance Frequency

Heng Lu (1), Wei Zhang (1), Xu Shao (1), Quan Zhou (2), Wenhui Lei (2), Hongbin Zhou (2), Andrew Breen (3)

(1) Nuance Communications, USA
(2) Nuance Communications, China
(3) Nuance Communications, UK

To reduce the footprint of concatenative speech synthesis systems for embedded devices, this work introduces a novel method for pruning redundant units. Instead of relying only on a pruning criterion based on unit appearance frequency, as the conventional method does, the new method introduces the concept of “delta unit appearance frequency”, which indicates whether a unit is replaceable. Both static and delta unit appearance frequency serve as pruning criteria in the proposed method. Only units that have a comparatively high appearance frequency and cannot be replaced by other units are preserved in the database. Experiments show that the new method greatly reduces the footprint of our speech synthesis system without much loss of synthesized voice quality.
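The two-criterion idea can be illustrated with a minimal sketch. The abstract does not define how delta unit appearance frequency is computed, so everything below is an assumption made for illustration only: the `Selection` record, the `prune_units` function, the thresholds `min_static` and `min_delta`, and the `cost_margin` test for replaceability are all hypothetical names, and the paper's actual formulation may differ. Here the static frequency is taken as the number of times a unit is selected when synthesizing a text corpus, and the delta frequency as the number of those selections for which no alternative candidate came within a cost margin.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set


class Selection(NamedTuple):
    """One unit-selection event logged while synthesizing a corpus (hypothetical)."""
    unit_id: str           # unit chosen by the synthesizer
    best_cost: float       # cost of the chosen unit
    runner_up_cost: float  # cost of the best alternative (inf if there is none)


def prune_units(selections: Iterable[Selection],
                min_static: int,
                min_delta: int,
                cost_margin: float) -> Set[str]:
    """Return the ids of units to keep.

    Assumption: a selection counts toward the delta frequency only when the
    best alternative is worse than the chosen unit by more than cost_margin,
    i.e. the unit was hard to replace on that occasion.
    """
    static_freq: Counter = Counter()
    delta_freq: Counter = Counter()
    for s in selections:
        static_freq[s.unit_id] += 1
        if s.runner_up_cost - s.best_cost > cost_margin:
            delta_freq[s.unit_id] += 1
    # Keep units that are both frequently used and hard to replace.
    return {u for u, n in static_freq.items()
            if n >= min_static and delta_freq[u] >= min_delta}


# Toy log: "a" is frequent and irreplaceable, "b" is frequent but always has
# a close alternative, "c" is irreplaceable but rarely used.
log = ([Selection("a", 1.0, 5.0)] * 3
       + [Selection("b", 1.0, 1.1)] * 3
       + [Selection("c", 1.0, float("inf"))])
kept = prune_units(log, min_static=2, min_delta=2, cost_margin=0.5)
```

In this toy log only unit "a" satisfies both criteria, which mirrors the rule stated in the abstract: a unit is preserved only if it appears comparatively often and cannot be replaced by other units.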

Bibliographic reference.  Lu, Heng / Zhang, Wei / Shao, Xu / Zhou, Quan / Lei, Wenhui / Zhou, Hongbin / Breen, Andrew (2015): "Pruning redundant synthesis units based on static and delta unit appearance frequency", In INTERSPEECH-2015, 269-273.