EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Hidden Feature Models for Speech Recognition Using Dynamic Bayesian Networks

Karen Livescu (1), James Glass (1), Jeff Bilmes (2)

(1) Massachusetts Institute of Technology, USA
(2) University of Washington, USA

In this paper, we investigate the use of dynamic Bayesian networks (DBNs) to explicitly represent models of hidden features, such as articulatory or other phonological features, for automatic speech recognition. In previous work using the idea of hidden features, the representation has typically been implicit, relying on a single hidden state to represent a combination of features. We present a class of DBN-based hidden feature models, and show that such a representation can be not only more expressive but also more parsimonious. We also describe a way of representing the acoustic observation model with fewer distributions using a product of models, each corresponding to a subset of the features. Finally, we describe our recent experiments using hidden feature models on the Aurora 2.0 corpus.
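
As a rough illustration of the parsimony argument, the sketch below counts how many observation distributions are needed when a single hidden state encodes every combination of feature values, versus when the observation model is a product of factors, each conditioned on a subset of the features. The feature names, their cardinalities, and the grouping into subsets are hypothetical choices made for this example; they are not taken from the paper.

```python
from math import prod

def count_joint_distributions(cardinalities):
    """One distribution per combination of all feature values
    (the single-hidden-state representation)."""
    return prod(cardinalities.values())

def count_product_distributions(cardinalities, subsets):
    """One distribution per value combination within each subset,
    summed over the factors of the product model."""
    return sum(prod(cardinalities[f] for f in subset) for subset in subsets)

# Hypothetical feature inventory (assumed for illustration only).
cardinalities = {"voicing": 2, "manner": 5, "place": 8}

# Hypothetical factorization: one factor for voicing alone,
# one factor for manner and place jointly.
subsets = [("voicing",), ("manner", "place")]

joint = count_joint_distributions(cardinalities)
factored = count_product_distributions(cardinalities, subsets)
fully_factored = count_product_distributions(
    cardinalities, [(f,) for f in cardinalities]
)

print(f"joint state model:    {joint} distributions")
print(f"two-factor product:   {factored} distributions")
print(f"one factor per feature: {fully_factored} distributions")
```

Under these assumed cardinalities the joint model needs 2 x 5 x 8 distributions, while the factored variants need only the sum of the per-factor counts. The saving grows with the number of features, at the cost of whatever dependence between factors the product ignores.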

Bibliographic reference.  Livescu, Karen / Glass, James / Bilmes, Jeff (2003): "Hidden feature models for speech recognition using dynamic Bayesian networks", In EUROSPEECH-2003, 2529-2532.