12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

The JSafran Platform for Semi-Automatic Speech Processing

Christophe Cerisara, Claire Gardent

LORIA, France

JSafran is an open-source Java platform for editing, annotating and transforming speech corpora, both manually and automatically, at many levels: transcription, alignment, morphosyntactic tagging, syntactic parsing and semantic role labelling. It integrates preconfigured state-of-the-art libraries for this purpose, including the Sphinx4, TreeTagger, OpenNLP, MaltParser and MATE applications, as well as the companion JTrans software for text-to-speech alignment and transcription. Despite the complexity of such speech processing tasks, JSafran has been designed to maximize simplicity both for the end-user, thanks to an easy-to-use GUI that controls all of these automatic and manual annotation functionalities, and for the developer, thanks to well-defined interfaces and to the multilevel stand-off annotation paradigm. JSafran has been used so far for several tasks, including the creation of a new French treebank on top of the broadcast news ESTER corpus.
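
The multilevel stand-off annotation paradigm mentioned above can be illustrated with a minimal Java sketch. This is not JSafran's actual API: the class `StandoffSketch`, its methods, and the layer names `"pos"` and `"chunk"` are hypothetical and chosen only for illustration. The idea shown is that the token sequence is stored once and never modified, while each annotation level lives in its own layer that refers to tokens by index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of stand-off annotation; not part of JSafran. */
public class StandoffSketch {

    /** A label attached to the token span [start, end). */
    public static final class Span {
        public final int start;
        public final int end;
        public final String label;

        public Span(int start, int end, String label) {
            this.start = start;
            this.end = end;
            this.label = label;
        }
    }

    // The base text is stored once and is read-only.
    private final List<String> tokens;
    // Each annotation level (layer) keeps its own list of spans.
    private final Map<String, List<Span>> layers = new LinkedHashMap<>();

    public StandoffSketch(List<String> tokens) {
        this.tokens = Collections.unmodifiableList(new ArrayList<>(tokens));
    }

    /** Adds a label to a layer without touching the tokens or other layers. */
    public void annotate(String layer, int start, int end, String label) {
        if (start < 0 || end > tokens.size() || start >= end) {
            throw new IllegalArgumentException("invalid span: " + start + ".." + end);
        }
        layers.computeIfAbsent(layer, k -> new ArrayList<>())
              .add(new Span(start, end, label));
    }

    /** Returns the spans of one layer, or an empty list if the layer is unused. */
    public List<Span> layer(String name) {
        List<Span> spans = layers.get(name);
        return spans == null ? Collections.<Span>emptyList() : spans;
    }

    /** Recovers the text covered by a span from the shared token sequence. */
    public String text(Span span) {
        return String.join(" ", tokens.subList(span.start, span.end));
    }

    public static void main(String[] args) {
        StandoffSketch doc = new StandoffSketch(Arrays.asList("le", "chat", "dort"));
        doc.annotate("pos", 0, 1, "DET");   // word-level layer
        doc.annotate("chunk", 0, 2, "NP");  // phrase-level layer over the same tokens
        for (Span s : doc.layer("chunk")) {
            System.out.println(s.label + ": " + doc.text(s));
        }
    }
}
```

Because layers only reference token positions, a new annotation level can be added or recomputed by one tool without rewriting the output of another, which is one plausible reason such a paradigm simplifies integration for developers.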


Bibliographic reference. Cerisara, Christophe / Gardent, Claire (2011): "The JSafran platform for semi-automatic speech processing", in INTERSPEECH-2011, 3241-3244.