This paper proposes a new speech modification algorithm, based on a vocoder framework, for synthesizing high-quality speech. Its innovation lies in preserving the fine structure of the magnitude spectrum. One key point is the use of a “compensatory Gaussian window” to extract moderate F0 harmonic structures from the magnitude spectrum. The other is the generation, starting from the magnitude spectrum, of F0 harmonic structures that match the target fundamental frequency. Preference tests show that the proposed algorithm synthesizes higher-quality speech than TD-PSOLA when large prosody modification is needed, and that the spectral envelope produced by the proposed algorithm is superior to that of conventional vocoders, especially when the frequency is modified upward.
Cite as: Takano, S., Abe, M. (1999) A new F0 modification algorithm by manipulating harmonics of magnitude spectrum. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 1875-1878, doi: 10.21437/Eurospeech.1999-410
@inproceedings{takano99_eurospeech,
  author={Satoshi Takano and Masanobu Abe},
  title={{A new F0 modification algorithm by manipulating harmonics of magnitude spectrum}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={1875--1878},
  doi={10.21437/Eurospeech.1999-410}
}