ISCA Archive ICSLP 2000

A hybrid approach for grapheme-to-phoneme conversion based on a combination of partial string matching and a neural network

Horst-Udo Hain

The quality of a text-to-speech (TTS) system depends heavily on the transcription quality of the words to be spoken. The best transcription is obviously the one found in a phonetic dictionary, but for out-of-vocabulary (OOV) words, fallback routines have to be developed.

This paper proposes a fallback routine that combines the correctness of a phonetic dictionary with the flexibility of a neural network. In the first step, parts of the OOV word are looked up in the dictionary. The parts found are then joined, with the additional feature that the last phoneme of the first part is re-estimated using a neural network and a special phonetic dictionary. In the second step, the word stress is determined either from the dictionary or by a second neural network.
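
The first step can be illustrated with a minimal Python sketch. It is not the paper's implementation: the toy lexicon, the phoneme symbols, the greedy longest-prefix segmentation, and every function name are assumptions made for illustration. The neural network that re-estimates the last phoneme of the first part is replaced by a hypothetical hook, `reestimate`, which by default leaves the phoneme unchanged. The stress assignment of the second step is omitted.

```python
from typing import Callable, Dict, List, Optional

Phonemes = List[str]

# Toy lexicon; entries and phoneme symbols are illustrative assumptions.
TOY_LEXICON: Dict[str, Phonemes] = {
    "sun": ["s", "V", "n"],
    "light": ["l", "aI", "t"],
    "house": ["h", "aU", "s"],
}


def split_into_known_parts(word: str, lexicon: Dict[str, Phonemes]) -> Optional[List[str]]:
    """Greedy longest-prefix segmentation (an assumed strategy).

    Returns None when some position cannot be covered by a lexicon entry.
    """
    parts: List[str] = []
    pos = 0
    while pos < len(word):
        # Try the longest candidate substring first.
        for end in range(len(word), pos, -1):
            if word[pos:end] in lexicon:
                parts.append(word[pos:end])
                pos = end
                break
        else:
            return None  # no entry starts at this position
    return parts


def keep_boundary_phoneme(left: Phonemes, right: Phonemes) -> str:
    """Placeholder for the neural re-estimation: keeps the phoneme as is."""
    return left[-1]


def transcribe_oov(
    word: str,
    lexicon: Dict[str, Phonemes],
    reestimate: Callable[[Phonemes, Phonemes], str] = keep_boundary_phoneme,
) -> Optional[Phonemes]:
    """Join the transcriptions of known parts, revisiting each boundary phoneme."""
    parts = split_into_known_parts(word, lexicon)
    if not parts:
        return None
    result = list(lexicon[parts[0]])
    for part in parts[1:]:
        right = lexicon[part]
        # Hypothetical hook: decide the last phoneme of the left part in context.
        result[-1] = reestimate(result, right)
        result.extend(right)
    return result
```

Calling `transcribe_oov("sunlight", TOY_LEXICON)` joins the entries for "sun" and "light". A trained model would be passed in through `reestimate`. Greedy matching can fail on words that a backtracking search would segment, so a real system would likely need a more careful search.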


Cite as: Hain, H.-U. (2000) A hybrid approach for grapheme-to-phoneme conversion based on a combination of partial string matching and a neural network. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 291-294

@inproceedings{hain00_icslp,
  author={Horst-Udo Hain},
  title={{A hybrid approach for grapheme-to-phoneme conversion based on a combination of partial string matching and a neural network}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 3, 291-294}
}