2nd Workshop on Spoken Language Technologies for Under-Resourced Languages

Universiti Sains, Penang, Malaysia
May 3-5, 2010

Development of a Speech-to-Text Transcription System for Finnish

Lori Lamel (1), Bianca Vieru (2)

(1) Spoken Language Processing Group, LIMSI-CNRS; (2) Vecsys Research; Orsay, France

This paper describes the development of a speech-to-text transcription system for the Finnish language. Finnish is a Finno-Ugric language spoken by about 6 million people living in Finland, as well as by some minorities in Sweden, Norway, Russia and Estonia. System development was carried out without any detailed manual transcriptions, relying instead on several sources of audio and textual data found on the web. Some of the audio sources were associated with approximate (and usually partial) texts, which were used to provide estimates of system performance.

Bibliographic reference. Lamel, Lori / Vieru, Bianca (2010): "Development of a speech-to-text transcription system for Finnish", in SLTU-2010, 62-67.