SAPA-SCALE Conference 2012, Portland, OR, USA
State-of-the-art solutions in automatic speech recognition (ASR) often rely on large amounts of expert prior knowledge, which is undesirable in some applications. In this paper, we consider a framework based on non-negative matrix factorisation (NMF) that learns a small vocabulary of words directly from input data, without prior knowledge such as phone sets and dictionaries. In the context of this learning scheme, we compare several spectral representations of speech. Where necessary, we propose changes to their derivation to avoid the use of prior linguistic knowledge. Also, in a comparison of several acoustic modelling techniques, we determine which model properties are beneficial to the framework's performance.
Index Terms: keyword learning, non-negative matrix factorisation, clustering, acoustic modelling
Bibliographic reference. Driesen, Joris / Gemmeke, Jort F. / Van hamme, Hugo (2012): "Data-driven speech representations for NMF-based word learning", In SAPA-SCALE-2012, 98-103.