To reduce the footprint of concatenative speech synthesis systems for embedded devices, this work introduces a novel method for pruning redundant units. Instead of relying solely on a pruning criterion based on unit appearance frequency, as the conventional method does, the new method introduces the concept of “delta unit appearance frequency”, which indicates whether a unit is replaceable. The proposed method uses both static and delta unit appearance frequency as pruning criteria. Only units that have a comparatively high appearance frequency and cannot be replaced by other units are preserved in the database. Experiments show that the new method greatly reduces the footprint of our speech synthesis system without much loss in synthesized voice quality.
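The two-criterion rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define how delta unit appearance frequency is computed, so the sketch assumes both statistics are precomputed per unit and that a larger delta value means the unit is harder to replace. The names `UnitStats` and `prune_units`, the thresholds, and the toy values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitStats:
    """Per-unit statistics, assumed to be precomputed elsewhere."""
    unit_id: str
    static_freq: float  # how often the unit appears in synthesis output
    delta_freq: float   # ASSUMPTION: larger value = harder to replace


def prune_units(units, min_static, min_delta):
    """Keep units that are both frequent and hard to replace.

    The AND of the two conditions mirrors the abstract's wording;
    the thresholds themselves are hypothetical tuning parameters.
    """
    return [
        u for u in units
        if u.static_freq >= min_static and u.delta_freq >= min_delta
    ]


# Toy data, invented purely for illustration.
units = [
    UnitStats("a_001", static_freq=120, delta_freq=0.9),  # frequent, irreplaceable
    UnitStats("a_002", static_freq=115, delta_freq=0.1),  # frequent, replaceable
    UnitStats("b_001", static_freq=3, delta_freq=0.8),    # rare
]

kept = prune_units(units, min_static=10, min_delta=0.5)
print([u.unit_id for u in kept])  # prints ['a_001']
```

In this sketch a frequency-only criterion would also keep `a_002`. The second condition is what removes a unit that is frequent but replaceable.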
Bibliographic reference. Lu, Heng / Zhang, Wei / Shao, Xu / Zhou, Quan / Lei, Wenhui / Zhou, Hongbin / Breen, Andrew (2015): "Pruning redundant synthesis units based on static and delta unit appearance frequency", In INTERSPEECH-2015, 269-273.