This paper presents an approach to complex cepstrum analysis based on the minimum mean squared error criterion, and describes its application to statistical parametric speech synthesis. The proposed method alleviates some of the issues associated with conventional complex cepstrum analysis, such as the choice of window, phase unwrapping, and the need for accurate pitch marks. Given initial estimates of the warped complex cepstra and their respective analysis instants, the method iteratively optimizes the complex cepstrum in a warped quefrency domain by minimizing the mean squared error between the natural and the reconstructed speech waveforms. When applied to statistical parametric speech synthesis, the optimized complex cepstrum yields better synthesized speech quality, especially for emotional databases, than the complex cepstrum calculated through conventional methods.
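For orientation, the conventional analysis that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the textbook complex cepstrum (log magnitude plus unwrapped phase, followed by an inverse FFT), not the proposed method: it involves no frequency warping and no iterative optimization. The function names `complex_cepstrum`, `reconstruct`, and `mse`, the FFT size, and the small constant added before the logarithm are all assumptions made for this example. The sketch also assumes an even FFT size and a frame whose spectrum is positive at zero frequency.

```python
import numpy as np


def complex_cepstrum(frame, n_fft=1024):
    """Textbook complex cepstrum of one frame (illustrative, unwarped)."""
    spectrum = np.fft.fft(frame, n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small constant avoids log(0)
    phase = np.unwrap(np.angle(spectrum))       # the phase unwrapping step
    half = n_fft // 2
    # Remove the linear phase term (an integer sample delay) so that the
    # remaining phase is odd-symmetric and the cepstrum is real-valued.
    n_delay = int(np.round(phase[half] / np.pi))
    phase = phase - np.pi * n_delay * np.arange(n_fft) / half
    cepstrum = np.fft.ifft(log_mag + 1j * phase).real
    return cepstrum, n_delay


def reconstruct(cepstrum, n_delay, n_fft=1024):
    """Invert the analysis above to recover a time-domain frame."""
    log_spec = np.fft.fft(cepstrum, n_fft)
    half = n_fft // 2
    linear_phase = np.pi * n_delay * np.arange(n_fft) / half
    spectrum = np.exp(log_spec.real + 1j * (log_spec.imag + linear_phase))
    return np.fft.ifft(spectrum).real


def mse(natural, reconstructed):
    """Mean squared error between two equal-length waveforms."""
    natural = np.asarray(natural, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    return float(np.mean((natural - reconstructed) ** 2))


if __name__ == "__main__":
    frame = np.array([1.0, 0.5])  # toy minimum-phase frame
    c, nd = complex_cepstrum(frame)
    x_hat = reconstruct(c, nd)
    padded = np.pad(frame, (0, len(x_hat) - len(frame)))
    print(c[:4], nd, mse(padded, x_hat))
```

For the toy frame [1.0, 0.5], the coefficients at quefrencies 1 and 2 come out close to 0.5 and -0.125, matching the series expansion of log(1 + 0.5 z^-1). The `mse` helper only illustrates the kind of waveform-domain error that the abstract says is minimized; how the proposed method computes and minimizes that error is not specified here.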
Bibliographic reference. Maia, Ranniery / Gales, M. J. F. / Stylianou, Yannis / Akamine, Masami (2013): "Minimum mean squared error based warped complex cepstrum analysis for statistical parametric speech synthesis", In INTERSPEECH-2013, 2336-2340.