This paper introduces a novel approach for generating multilingual text-to-phoneme mappings for use in multilingual speech recognition systems. The multilingual mappings are based on the weighted outputs from a neural network text-to-phoneme model trained on data mixed from several languages. Used together with a branched grammar decoding scheme, the multilingual mappings are able to capture both inter- and intra-language pronunciation variations, which is ideal for multilingual speaker-independent speech recognition systems. A significant improvement in overall system performance was obtained for a multilingual speaker-independent name dialing task when applying multilingual instead of language-dependent text-to-phoneme mapping.
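As an illustration only, and not the authors' implementation, the sketch below shows one possible reading of "weighted outputs": if a text-to-phoneme model emits a weighted distribution over phonemes for each letter, several pronunciation variants of a word can be kept by ranking phoneme sequences by the product of their per-letter weights. The function name `n_best_pronunciations`, the phoneme symbols, and the example weights are all hypothetical; the abstract does not specify how the outputs are combined.

```python
from typing import Dict, List, Tuple

Pronunciation = Tuple[Tuple[str, ...], float]


def n_best_pronunciations(
    posteriors: List[Dict[str, float]], n: int = 3
) -> List[Pronunciation]:
    """Return the n highest-scoring phoneme sequences.

    `posteriors` holds one dict per letter, mapping candidate phonemes to
    weights (assumed here to be model output probabilities). A sequence is
    scored by the product of its per-letter weights.
    """
    beam: List[Pronunciation] = [((), 1.0)]
    for dist in posteriors:
        # Extend every kept prefix with every candidate phoneme for this letter.
        expanded = [
            (seq + (phoneme,), score * weight)
            for seq, score in beam
            for phoneme, weight in dist.items()
        ]
        # Keep only the n best prefixes before moving to the next letter.
        expanded.sort(key=lambda item: item[1], reverse=True)
        beam = expanded[:n]
    return beam


# Hypothetical weights for a three-letter name; not taken from the paper.
example = [
    {"j": 0.6, "dZ": 0.4},  # first letter: two competing phonemes
    {"a": 0.7, "e": 0.3},   # second letter: two competing vowels
    {"n": 1.0},             # third letter: unambiguous
]

for phonemes, score in n_best_pronunciations(example, n=3):
    print(" ".join(phonemes), round(score, 2))
```

In a recognizer, each retained variant could become a parallel branch in the decoding grammar for that word, which is one way alternative pronunciations from different languages might coexist for a single name.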
Cite as: Riis, S.K., Pedersen, M.W., Jensen, K.J. (2001) Multilingual text-to-phoneme mapping. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 1441-1444, doi: 10.21437/Eurospeech.2001-24
@inproceedings{riis01_eurospeech,
  author={Søren Kamaric Riis and Morten With Pedersen and Kare Jean Jensen},
  title={{Multilingual text-to-phoneme mapping}},
  year=2001,
  booktitle={Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages={1441--1444},
  doi={10.21437/Eurospeech.2001-24}
}