The paper addresses the problem of designing a language-independent phonetic inventory for speech recognisers with a multilingual vocabulary. A new clustering algorithm for defining a multilingual set of triphones is proposed. The clustering algorithm is based on a distance measure for triphones, defined as a weighted sum of explicit estimates of context similarity at the monophone level. The monophone similarity estimation method is based on the algorithm of Houtgast. The clustering algorithm is integrated into a multilingual speech recognition system based on HTK V2.1.1. The ongoing experiments are based on the SpeechDat II databases. So far, the experiments have included the Slovenian, Spanish and German 1000 FDB SpeechDat (II) databases. Current results are very promising. The use of the clustering algorithm resulted in a significant reduction in the number of triphones at an acceptable level of degradation in word and language identification accuracy.
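As an illustration of the distance measure described above, the following is a minimal sketch, assuming a triphone is a (left context, centre phone, right context) tuple and that monophone similarities in [0, 1] are already available. The similarity values, the position weights, the names `mono_sim`, `triphone_distance` and `greedy_cluster`, and the greedy threshold grouping are all hypothetical placeholders chosen for the example. They are not the paper's Houtgast-based estimates, its weights, or its clustering procedure.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

Triphone = Tuple[str, str, str]  # (left context, centre phone, right context)

# Hypothetical monophone similarities in [0, 1]. In the paper these would come
# from a Houtgast-style estimate; the numbers here are invented placeholders.
SIMILARITY: Dict[FrozenSet[str], float] = {
    frozenset(("a", "e")): 0.6,
    frozenset(("s", "z")): 0.8,
    frozenset(("t", "d")): 0.7,
}


def mono_sim(p: str, q: str) -> float:
    """Similarity of two monophones; identical phones score 1, unknown pairs 0."""
    if p == q:
        return 1.0
    return SIMILARITY.get(frozenset((p, q)), 0.0)


def triphone_distance(
    x: Triphone,
    y: Triphone,
    weights: Tuple[float, float, float] = (0.25, 0.5, 0.25),  # assumed weights
) -> float:
    """Distance as one minus the weighted sum of per-position similarities."""
    similarity = sum(w * mono_sim(p, q) for w, p, q in zip(weights, x, y))
    return 1.0 - similarity


def greedy_cluster(triphones: Sequence[Triphone], threshold: float) -> List[List[Triphone]]:
    """Stand-in grouping: join the first cluster whose first member is close enough."""
    clusters: List[List[Triphone]] = []
    for t in triphones:
        for cluster in clusters:
            if triphone_distance(t, cluster[0]) <= threshold:
                cluster.append(t)
                break
        else:
            clusters.append([t])
    return clusters


if __name__ == "__main__":
    inventory = [("s", "a", "t"), ("z", "a", "d"), ("s", "e", "t"), ("t", "s", "a")]
    for group in greedy_cluster(inventory, threshold=0.15):
        print(group)
```

Raising the threshold merges more triphones into each cluster, which shrinks the inventory at the cost of coarser acoustic models. This is the trade-off between the number of triphones and recognition accuracy that the abstract refers to.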
Cite as: Imperl, B., Horvat, B. (1999) The clustering algorithm for the definition of multilingual set of context dependent speech models. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 887-890, doi: 10.21437/Eurospeech.1999-216
@inproceedings{imperl99_eurospeech,
  author={Bojan Imperl and Bogomir Horvat},
  title={{The clustering algorithm for the definition of multilingual set of context dependent speech models}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={887--890},
  doi={10.21437/Eurospeech.1999-216}
}