Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Top-down Bottom-up Hybrid Clustering Algorithm for Acoustic-Phonetic Modeling of Speech

José B. Marino, Albino Nogueiras-Rodríguez

TALP Research Center, Dpt. of Signal Theory and Communications, Universitat Politècnica de Catalunya, Barcelona, Spain

Full contextual coverage and smoothed training of phonetic models are the two main goals of acoustic modeling for continuous speech recognition systems. Traditionally, these objectives have been addressed with clustering techniques, either bottom-up or top-down. The two approaches offer complementary benefits. Because their optimization is unrestricted, bottom-up algorithms can reach a cluster configuration with better homogeneity than top-down clustering can. The phonetic guidance used by top-down clustering, however, gives complete context generalization, so that a model can be provided for contexts not seen during training. This paper reports a hybrid algorithm that combines the best properties of both approaches. The algorithm is applied to hidden Markov models of demiphones, a phonetic unit recently introduced by the authors. Combining the demiphone with the hybrid algorithm yields a noticeable reduction in the size of the set of phonetic units without degrading performance.


Bibliographic reference. Marino, José B. / Nogueiras-Rodríguez, Albino (1999): "Top-down bottom-up hybrid clustering algorithm for acoustic-phonetic modeling of speech", in EUROSPEECH'99, 1343-1346.