ISCA Archive Eurospeech 1999

The clustering algorithm for the definition of multilingual set of context dependent speech models

Bojan Imperl, Bogomir Horvat

The paper addresses the problem of designing a language-independent phonetic inventory for speech recognisers with a multilingual vocabulary. A new clustering algorithm for defining a multilingual set of triphones is proposed. The algorithm is based on a distance measure for triphones, defined as a weighted sum of explicit estimates of context similarity at the monophone level. The monophone similarity estimation method is based on the algorithm of Houtgast. The clustering algorithm is integrated into a multilingual speech recognition system based on HTK V2.1.1. The ongoing experiments are based on the SpeechDat II databases. So far, the experiments have included the Slovenian, Spanish and German 1000 FDB SpeechDat (II) databases. Current results are very promising: the clustering algorithm yielded a significant reduction in the number of triphones at an acceptable level of degradation in word and language identification accuracy.
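The triphone distance described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the weights nor the similarity values, so the table `MONOPHONE_SIMILARITY`, the default `weights`, the conversion of similarity to distance as `1 - similarity`, and all function names are assumptions made for illustration. In the paper the monophone similarities would come from the Houtgast-based estimation method rather than from a hand-written table.

```python
from typing import Dict, FrozenSet, Tuple

# A triphone as (left context, centre phone, right context).
Triphone = Tuple[str, str, str]

# Hypothetical monophone similarity estimates in [0, 1].
# The values are illustrative only and are not taken from the paper.
MONOPHONE_SIMILARITY: Dict[FrozenSet[str], float] = {
    frozenset(("a", "e")): 0.6,
    frozenset(("s", "z")): 0.8,
    frozenset(("t", "d")): 0.7,
}


def monophone_similarity(p: str, q: str) -> float:
    """Return the similarity of two monophones (1.0 when identical)."""
    if p == q:
        return 1.0
    # Pairs missing from the table are treated as completely dissimilar.
    return MONOPHONE_SIMILARITY.get(frozenset((p, q)), 0.0)


def triphone_distance(
    x: Triphone,
    y: Triphone,
    weights: Tuple[float, float, float] = (0.25, 0.5, 0.25),  # assumed weights
) -> float:
    """Distance as 1 minus a weighted sum of per-position similarities."""
    similarity = sum(
        w * monophone_similarity(a, b) for w, a, b in zip(weights, x, y)
    )
    return 1.0 - similarity


if __name__ == "__main__":
    print(triphone_distance(("s", "a", "t"), ("z", "a", "d")))
    print(triphone_distance(("s", "a", "t"), ("s", "e", "t")))
```

Under these assumptions, a clustering step would merge triphones from different languages whose distance falls below a chosen threshold; the threshold and the merging strategy are likewise not specified in the abstract.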


doi: 10.21437/Eurospeech.1999-216

Cite as: Imperl, B., Horvat, B. (1999) The clustering algorithm for the definition of multilingual set of context dependent speech models. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 887-890, doi: 10.21437/Eurospeech.1999-216

@inproceedings{imperl99_eurospeech,
  author={Bojan Imperl and Bogomir Horvat},
  title={{The clustering algorithm for the definition of multilingual set of context dependent speech models}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={887--890},
  doi={10.21437/Eurospeech.1999-216}
}