ISCA Archive ICSLP 1998

A mixed-excitation frequency domain model for time-scale pitch-scale modification of speech

Alex Acero

This paper presents a time-scale pitch-scale modification technique for concatenative speech synthesis. The method is based on a frequency domain source-filter model, where the source is modeled as a mixed excitation. This model is tightly coupled with a compression scheme that results in compact acoustic inventories. When compared to the approach in the Whistler system, which uses no mixed excitation, the new method shows improvement in voiced fricatives and over-stretched voiced sounds. In addition, it allows for spectral manipulation such as smoothing of discontinuities at unit boundaries, voice transformations, or loudness equalization.
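To make the idea of a frequency domain source-filter model with mixed excitation concrete, here is a minimal sketch of how one frame could be synthesized. It is not taken from the paper. The hard voicing cutoff (`cutoff_bin`), the zero-phase voiced component, the random-phase noise component, and the function names are all assumptions made for illustration. The abstract does not specify how the excitation is mixed.

```python
import cmath
import math
import random


def mixed_excitation_spectrum(n_bins, cutoff_bin, rng):
    """Unit-magnitude excitation for the half spectrum (bins 0..n_bins-1).

    Assumption: bins below cutoff_bin are voiced (zero phase, pulse-like);
    bins at or above it are unvoiced (random phase, noise-like).
    """
    spectrum = []
    for k in range(n_bins):
        if k < cutoff_bin:
            phase = 0.0
        else:
            phase = rng.uniform(-math.pi, math.pi)
        spectrum.append(cmath.rect(1.0, phase))
    return spectrum


def synthesize_frame(envelope, cutoff_bin, seed=0):
    """Shape the excitation with a filter magnitude envelope, then invert."""
    rng = random.Random(seed)
    n_bins = len(envelope)      # bins 0..N/2 inclusive
    n = 2 * (n_bins - 1)        # time-domain frame length N
    excitation = mixed_excitation_spectrum(n_bins, cutoff_bin, rng)
    half = [e * h for e, h in zip(excitation, envelope)]
    # DC and Nyquist bins must be real for a real-valued signal.
    half[0] = complex(abs(half[0]), 0.0)
    half[-1] = complex(abs(half[-1]), 0.0)
    # Build the Hermitian-symmetric full spectrum.
    full = half + [half[k].conjugate() for k in range(n_bins - 2, 0, -1)]
    # Naive inverse DFT (O(N^2)); adequate for a short illustrative frame.
    frame = []
    for t in range(n):
        acc = sum(full[k] * cmath.exp(2j * math.pi * k * t / n)
                  for k in range(n))
        frame.append(acc.real / n)
    return frame
```

With a flat envelope and every bin voiced, for example `synthesize_frame([1.0] * 9, cutoff_bin=9)`, the frame reduces to a single pulse at the first sample. Lowering `cutoff_bin` replaces the upper bins with noise-like content while leaving the spectral envelope unchanged, which is the kind of behavior one would want for voiced fricatives.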
doi: 10.21437/ICSLP.1998-16

Cite as: Acero, A. (1998) A mixed-excitation frequency domain model for time-scale pitch-scale modification of speech. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0072, doi: 10.21437/ICSLP.1998-16

@inproceedings{acero98_icslp,
  author={Alex Acero},
  title={{A mixed-excitation frequency domain model for time-scale pitch-scale modification of speech}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0072},
  doi={10.21437/ICSLP.1998-16}
}