This paper proposes the use of Quantized Hidden Markov Models (QHMMs) to reduce the footprint of a conventional parametric HMM-based text-to-speech (TTS) system. This technique was previously applied successfully to automatic speech recognition on embedded devices without loss of recognition performance. In this paper, we investigate the construction of different quantized HMM configurations that serve as input to the standard maximum-likelihood (ML) parameter generation algorithm. We use both subjective and objective tests to compare the resulting systems. Subjective results for specific compression configurations show no significant preference, although some spectral distortion is reported. We conclude that a trade-off is necessary in order to satisfy both speech quality and low-footprint memory requirements.
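The footprint versus distortion trade-off mentioned above can be illustrated with a minimal sketch. It is not the quantization scheme used in the paper, which the abstract does not specify. It assumes a plain uniform scalar quantizer applied to a hypothetical set of Gaussian mean parameters, with a 32-bit float baseline for storage. The names `quantize`, `dequantize`, and `means` are illustrative only.

```python
import random


def quantize(values, bits):
    """Uniformly quantize floats to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # Guard against a zero step when all values are identical.
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Map integer codes back to approximate float values."""
    return [lo + c * step for c in codes]


# Hypothetical stand-in for HMM mean parameters (assumption: not real model data).
random.seed(0)
means = [random.gauss(0.0, 1.0) for _ in range(1000)]

results = {}
for bits in (8, 4, 2):
    codes, lo, step = quantize(means, bits)
    restored = dequantize(codes, lo, step)
    # Root-mean-square error between original and reconstructed parameters.
    rmse = (sum((a - b) ** 2 for a, b in zip(means, restored)) / len(means)) ** 0.5
    results[bits] = rmse
    # Compression ratio relative to an assumed 32-bit float representation.
    print(f"{bits} bits: ratio {32 / bits:.0f}x, RMSE {rmse:.4f}")
```

In this toy setting, fewer bits per parameter give a smaller model but a larger reconstruction error, which mirrors the kind of trade-off between memory footprint and speech quality that the paper reports.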
Bibliographic reference. Gutkin, Alexander / Gonzalvo, Xavi / Breuer, Stefan / Taylor, Paul (2010): "Quantized HMMs for low footprint text-to-speech synthesis", In INTERSPEECH-2010, 837-840.