8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Automatically Learning the Units of Speech by Non-Negative Matrix Factorisation

Veronique Stouten, Kris Demuynck, Hugo Van hamme

Katholieke Universiteit Leuven, Belgium

We present an unsupervised technique to discover the (word-sized) speech units into which a corpus of utterances can be decomposed. First, a fixed-length, high-dimensional vector representation of each utterance is obtained. Then, the resulting matrix is decomposed into additive units by applying the non-negative matrix factorisation algorithm. On a small-vocabulary task, each of the obtained basis vectors represents one of the uttered words. We also investigate the amount of speech data needed to obtain a correct set of basis vectors. By decreasing the number of occurrences of the words in the corpus, an indication of the learning rate of the system is obtained.
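The decomposition step can be illustrated with a minimal sketch. The abstract does not state which cost function or update rule is used, so the code below assumes the standard multiplicative updates for the Frobenius-norm objective. The function name `nmf`, its parameters, and the synthetic data are hypothetical and stand in for the utterance representation described above: each column of `V` plays the role of one utterance, and each column of `W` plays the role of one learned unit.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorise a non-negative matrix V (features x utterances) as W @ H.

    Assumption: multiplicative updates for the Frobenius-norm objective;
    the abstract does not specify the cost function.
    """
    rng = np.random.default_rng(seed)
    n_features, n_utterances = V.shape
    # Random non-negative initialisation of basis vectors and activations.
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_utterances)) + eps
    for _ in range(n_iter):
        # Element-wise updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for the corpus: 3 hypothetical "word" patterns,
# each utterance being an additive combination of a subset of them.
rng = np.random.default_rng(1)
true_units = rng.random((50, 3))                           # features x words
activations = rng.integers(0, 2, size=(3, 40)).astype(float)  # words x utterances
V = true_units @ activations

W, H = nmf(V, rank=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

In this toy setting, the columns of `W` are the candidate units and the rows of `H` indicate how strongly each unit is present in each utterance. Reducing the number of columns of `V` would mimic the reduced-data experiment mentioned in the abstract.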


Bibliographic reference. Stouten, Veronique / Demuynck, Kris / Van hamme, Hugo (2007): "Automatically learning the units of speech by non-negative matrix factorisation", in INTERSPEECH-2007, 1937-1940.