EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Unlimited Vocabulary Speech Recognition Based on Morphs Discovered in an Unsupervised Manner

Vesa Siivola, Teemu Hirsimäki, Mathias Creutz, Mikko Kurimo

Helsinki University of Technology, Finland

We study continuous speech recognition based on subword units found in an unsupervised fashion. For agglutinative languages such as Finnish, traditional word-based n-gram language modeling works poorly because of the very large number of distinct word forms. We use a method based on the Minimum Description Length (MDL) principle to split words statistically into subword units, which allows efficient language modeling with an unlimited vocabulary. Perplexity and speech recognition experiments on Finnish speech data show that the resulting model clearly outperforms both word-based and syllable-based trigram models. Compared with the word trigram model, the out-of-vocabulary rate falls from 20% to 0% and the word error rate from 56% to 32%.
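As a rough illustration of the segmentation idea, the sketch below splits a word into the morph sequence with the shortest total code length, using a Viterbi search over a fixed morph lexicon. It is not the authors' algorithm: the paper's method also learns the lexicon itself under the MDL criterion, whereas here the lexicon, its counts, and the function name `segment` are hypothetical. The sketch also assumes that every single letter is included as a morph, which is one way an unseen word form can always be segmented.

```python
import math

def segment(word, morph_counts):
    """Split `word` into morphs minimizing total code length, sum of -log2 p(morph)."""
    total = sum(morph_counts.values())
    # cost[i]: cheapest code length in bits for word[:i]
    # back[i]: start index of the last morph in that cheapest split
    cost = [0.0] + [math.inf] * len(word)
    back = [0] * (len(word) + 1)
    for end in range(1, len(word) + 1):
        for start in range(end):
            piece = word[start:end]
            if piece in morph_counts and cost[start] < math.inf:
                c = cost[start] - math.log2(morph_counts[piece] / total)
                if c < cost[end]:
                    cost[end], back[end] = c, start
    if math.isinf(cost[-1]):
        raise ValueError(f"cannot segment {word!r} with this lexicon")
    # Follow the back-pointers from the end of the word to recover the morphs.
    morphs, i = [], len(word)
    while i > 0:
        morphs.append(word[back[i]:i])
        i = back[i]
    return morphs[::-1]

# Hypothetical toy lexicon: three multi-letter morphs plus every single letter.
lexicon = {"talo": 5, "auto": 4, "ssa": 6}
lexicon.update({ch: 1 for ch in "abcdefghijklmnopqrstuvwxyzäö"})

print(segment("talossa", lexicon))  # a word form built from known morphs
print(segment("kissa", lexicon))    # a word form whose stem is not in the lexicon
```

Because the single-letter morphs cover the alphabet, a word form absent from the training data still receives a segmentation, so in this toy setting no word is out of vocabulary.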


Bibliographic reference. Siivola, Vesa / Hirsimäki, Teemu / Creutz, Mathias / Kurimo, Mikko (2003): "Unlimited vocabulary speech recognition based on morphs discovered in an unsupervised manner", In EUROSPEECH-2003, 2293-2296.