ISCA Archive Eurospeech 1991

Phoneme duration rules for speech synthesis by neural networks

Matti Karjalainen, Toomas Altosaar

It has recently been proposed and demonstrated that neural networks, instead of conventional discrete rules, can be applied to speech synthesis from text. This paper concentrates primarily on a subtask of text-to-speech synthesis, namely the computation of phoneme durations by taking into account the complex contextual information of a phoneme. Possible network-based formulations and data representations are discussed in general, and some experimental results are shown to demonstrate the feasibility of the approach. The results are promising: according to preliminary experiments, duration computation for the Finnish language performs well compared to the rule sets used so far. The more general problem of computing other control parameters of speech synthesis by neural networks is also discussed briefly.

Keywords: Speech synthesis by rule, Neural networks, Prosodic features of speech


doi: 10.21437/Eurospeech.1991-156

Cite as: Karjalainen, M., Altosaar, T. (1991) Phoneme duration rules for speech synthesis by neural networks. Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991), 633-636, doi: 10.21437/Eurospeech.1991-156

@inproceedings{karjalainen91_eurospeech,
  author={Matti Karjalainen and Toomas Altosaar},
  title={{Phoneme duration rules for speech synthesis by neural networks}},
  year=1991,
  booktitle={Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991)},
  pages={633--636},
  doi={10.21437/Eurospeech.1991-156}
}