EUROSPEECH '97
5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997


Bottom-Up and Top-Down State Clustering for Robust Acoustic Modeling

Cristina Chesta (1), Pietro Laface (1), Franco Ravera (2)

(1) Dipartimento di Automatica e Informatica - Politecnico di Torino, Italy
(2) CSELT - Centro Studi e Laboratori Telecomunicazioni, Torino, Italy

In this paper we describe our experience with bottom-up and top-down state clustering techniques for the definition and training of robust acoustic-phonetic units. Using as a test-bed a speaker-independent, telephone-speech, isolated-word recognition task with a vocabulary of 475 city names, we show that similar performance is obtained by tying the HMM states with either an agglomerative or a decision-tree clustering approach. Moreover, better results are obtained by selecting a priori the set of states that can be clustered, rather than relying solely on their acoustic similarity. In the bottom-up approach, we propose a stopping criterion for the furthest-neighbor clustering procedure that does not require a threshold. In the top-down approach, we show that a carefully selected impurity function allows lookahead search to outperform the classical decision-tree growing algorithm.
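
For readers unfamiliar with furthest-neighbor (complete-linkage) agglomerative clustering, the following is a minimal Python sketch of the general technique. It is not the authors' implementation. The function names, the use of Euclidean distance between hypothetical per-state feature vectors, and the fixed target number of clusters are all assumptions made for illustration. In particular, the fixed cluster count is only a stand-in: the abstract does not specify the threshold-free stopping criterion that the paper proposes.

```python
from itertools import combinations
from math import dist


def complete_linkage(a, b, points):
    """Furthest-neighbor distance: largest pairwise distance between two clusters."""
    return max(dist(points[i], points[j]) for i in a for j in b)


def furthest_neighbor_clustering(points, n_clusters):
    """Merge clusters bottom-up until n_clusters remain (assumed stopping rule)."""
    # Start with every item (e.g. one HMM state) in its own cluster.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters whose furthest members are closest.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: complete_linkage(clusters[p[0]], clusters[p[1]], points),
        )
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical one-dimensional "state" vectors, for illustration only.
states = [(0.0,), (0.1,), (5.0,), (5.2,), (10.0,)]
print(furthest_neighbor_clustering(states, 3))
```

Because the furthest-neighbor distance is driven by the two most dissimilar members, a merge is accepted only when every member of one cluster is reasonably close to every member of the other, which tends to produce compact clusters.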


Bibliographic reference.  Chesta, Cristina / Laface, Pietro / Ravera, Franco (1997): "Bottom-up and top-down state clustering for robust acoustic modeling", In EUROSPEECH-1997, 11-14.