5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Viterbi Based Splitting of Phoneme HMM's

Luis Javier Rodriguez, Ines M. Torres

Dpto. Electricidad y Electrónica, Fac. Ciencias, Universidad del País Vasco, Bilbao, Spain

Continuous Speech Recognition (CSR) systems usually include large sets of context-dependent units to model contextual variations in the pronunciation of phones. The goal of this work was to obtain adequate sets of sub-lexical models using acoustic information only, without any prior phonological knowledge. At each iteration of a classical Viterbi training scheme, each acoustic model was split into a set of more accurate models. This approach was evaluated on a Spanish acoustic-phonetic decoding task. The experimental results showed that it yields recognition rates similar to those of classical triphones.

