ISCA Archive Interspeech 2013

Effect of MPEG audio compression on HMM-based speech synthesis

Bajibabu Bollepalli, Tuomo Raitio, Paavo Alku

This paper studies the effect of MPEG audio compression on HMM-based speech synthesis. Speech signals are encoded at various bit rates and analyzed using the GlottHMM vocoder. Objective evaluation results show that the vocoder parameters begin to degrade at bit rates of 32 kbit/s or lower, which is also confirmed by the subjective evaluation of the vocoder analysis-synthesis quality. Experiments with HMM-based speech synthesis show that the subjective quality of a synthetic voice trained with 32 kbit/s speech is comparable to that of a voice trained with uncompressed speech, but lower bit rates induce clear degradation in quality.


doi: 10.21437/Interspeech.2013-119

Cite as: Bollepalli, B., Raitio, T., Alku, P. (2013) Effect of MPEG audio compression on HMM-based speech synthesis. Proc. Interspeech 2013, 1062-1066, doi: 10.21437/Interspeech.2013-119

@inproceedings{bollepalli13_interspeech,
  author={Bajibabu Bollepalli and Tuomo Raitio and Paavo Alku},
  title={{Effect of MPEG audio compression on HMM-based speech synthesis}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={1062--1066},
  doi={10.21437/Interspeech.2013-119}
}