INTERSPEECH 2004 - ICSLP
This paper presents a novel approach to modeling speech data with Dynamic Bayesian Networks. Instead of defining a specific set of factors that affect speech signals, the factors are modeled implicitly by clustering the speech data. Different data clusters correspond to different subsets of the factor values, and these subsets are represented by corresponding factor states. The factor states, together with the phone states, form two hidden layers in the Hidden Factor Dynamic Bayesian Network (HFDBN). In this study we show that HFDBNs provide better speech recognition performance than HMMs of equal complexity. Speech recognition experiments conducted on speech data recorded in a moving car demonstrated the advantage of HFDBNs over HMMs for recognition of both clean and noisy speech.
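As an illustration of the two-hidden-layer idea, the following is a minimal sketch of a forward likelihood computation over a joint (phone state, factor state) space. It is not the authors' implementation. The dependency structure is an assumption made for this example: the phone and factor layers evolve as independent Markov chains, and each discrete observation depends on both. The function name `forward_likelihood` and all parameter values are hypothetical toy choices.

```python
from itertools import product


def forward_likelihood(obs, init_q, init_f, trans_q, trans_f, emit):
    """Return P(obs) for a toy DBN with hidden phone state q and factor state f.

    Assumption (not taken from the paper): q and f follow independent Markov
    chains, and each discrete observation depends on the pair (q, f).
    """
    n_q, n_f = len(init_q), len(init_f)
    states = list(product(range(n_q), range(n_f)))
    # alpha[(q, f)] = P(o_1..o_t, q_t = q, f_t = f)
    alpha = {(q, f): init_q[q] * init_f[f] * emit[q][f][obs[0]]
             for q, f in states}
    for o in obs[1:]:
        new_alpha = {}
        for q, f in states:
            # Sum over previous joint states; the transition factorises per layer.
            total = sum(alpha[(pq, pf)] * trans_q[pq][q] * trans_f[pf][f]
                        for pq, pf in states)
            new_alpha[(q, f)] = total * emit[q][f][o]
        alpha = new_alpha
    return sum(alpha.values())


# Hypothetical toy parameters: 2 phone states, 2 factor states, 2 symbols.
init_q = [0.6, 0.4]
init_f = [0.5, 0.5]
trans_q = [[0.7, 0.3], [0.2, 0.8]]
trans_f = [[0.9, 0.1], [0.1, 0.9]]
# emit[q][f][o] = P(o | q, f); each innermost row sums to 1.
emit = [[[0.8, 0.2], [0.5, 0.5]],
        [[0.3, 0.7], [0.1, 0.9]]]

likelihood = forward_likelihood([0, 1, 1], init_q, init_f, trans_q, trans_f, emit)
print(likelihood)
```

Under this assumed factorisation, the transition model needs one table per layer rather than a single table over all (phone, factor) pairs, which is one way a layered network can keep its parameter count small. Real systems would use continuous acoustic features and log-domain arithmetic rather than the discrete toy tables above.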
Bibliographic reference. Korkmazsky, Filipp / Deviren, Murat / Fohr, Dominique / Illina, Irina (2004): "Hidden factor dynamic Bayesian networks for speech recognition", In INTERSPEECH-2004, 2801-2804.