8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Hidden Factor Dynamic Bayesian Networks for Speech Recognition

Filipp Korkmazsky, Murat Deviren, Dominique Fohr, Irina Illina

LORIA, France

This paper presents a novel approach to modeling speech data with Dynamic Bayesian Networks. Instead of defining a specific set of factors that affect the speech signal, the approach models these factors implicitly by clustering the speech data. Different data clusters correspond to different subsets of the factor values, and each subset is represented by a corresponding factor state. The factor states, together with the phone states, form the two hidden layers of the Hidden Factor Dynamic Bayesian Network (HFDBN). The study shows that HFDBNs provide better speech recognition performance than HMMs of equal complexity. Speech recognition experiments conducted on speech data recorded in a moving car demonstrated the advantage of HFDBNs over HMMs for recognition of both clean and noisy speech.
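The two-hidden-layer idea can be illustrated with a small sketch. The abstract does not give the network's dependency structure, so everything below is an assumption made for illustration only: the phone state and the factor state are taken to evolve as independent Markov chains, the observation is discrete and depends on both states, and all parameter values and names (`forward_likelihood`, `init_q`, `trans_f`, `emit`, and so on) are hypothetical. The function computes the likelihood of an observation sequence with the forward recursion over joint (phone state, factor state) pairs.

```python
from itertools import product


def forward_likelihood(obs, init_q, init_f, trans_q, trans_f, emit):
    """Return P(obs) for a toy model with two hidden layers.

    Assumed factorization (an illustration, not taken from the paper):
      P(q_t, f_t | q_{t-1}, f_{t-1}) = P(q_t | q_{t-1}) * P(f_t | f_{t-1})
      P(o_t | q_t, f_t) = emit[q_t][f_t][o_t]
    where q is the phone state and f is the factor state.
    """
    states = list(product(range(len(init_q)), range(len(init_f))))
    # alpha[(q, f)] = P(o_1..o_t, q_t = q, f_t = f)
    alpha = {
        (q, f): init_q[q] * init_f[f] * emit[q][f][obs[0]] for q, f in states
    }
    for o in obs[1:]:
        alpha = {
            (q, f): emit[q][f][o]
            * sum(
                alpha[(pq, pf)] * trans_q[pq][q] * trans_f[pf][f]
                for pq, pf in states
            )
            for q, f in states
        }
    return sum(alpha.values())


# Hypothetical toy parameters: 2 phone states, 2 factor states, 2 symbols.
init_q = [0.6, 0.4]
init_f = [0.5, 0.5]
trans_q = [[0.7, 0.3], [0.2, 0.8]]
trans_f = [[0.9, 0.1], [0.1, 0.9]]
# emit[q][f][o]: the factor state changes the emission for the same phone
# state, e.g. a "clean" versus a "noisy" acoustic condition.
emit = [
    [[0.8, 0.2], [0.5, 0.5]],
    [[0.3, 0.7], [0.1, 0.9]],
]

likelihood = forward_likelihood([0, 1, 1], init_q, init_f, trans_q, trans_f, emit)
```

In a real system the emissions would be continuous densities over acoustic feature vectors, and the factor states would come from clustering the training data as the abstract describes; the discrete toy version only shows how a second hidden layer enters the likelihood computation.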


Bibliographic reference. Korkmazsky, Filipp / Deviren, Murat / Fohr, Dominique / Illina, Irina (2004): "Hidden factor dynamic Bayesian networks for speech recognition", In INTERSPEECH-2004, 2801-2804.