Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

The LIUM Speech Transcription System: A CMU Sphinx III-Based System for French Broadcast News

Paul Deléglise, Yannick Estève, Sylvain Meignier, Teva Merlin

LIUM-CNRS, France

This paper presents the system used by LIUM to participate in ESTER, the French broadcast news evaluation campaign. The system is based on the CMU Sphinx 3.3 (fast) decoder. We present tools that were added at different steps of the Sphinx recognition process: segmentation, acoustic model adaptation, and word-lattice rescoring.

Several experiments were conducted to study the effect of signal segmentation on the recognition process, the injection of automatically transcribed data into the training corpora, and different approaches to acoustic model adaptation. The results are presented in this paper.

With very few modifications and a simple MAP acoustic model estimation, the Sphinx 3.3 decoder reached a word error rate of 28.2%. The entire system developed by LIUM obtained an official word error rate of 23.6% in the ESTER evaluation, and an unsubmitted system reached 23.4%.

Bibliographic reference. Deléglise, Paul / Estève, Yannick / Meignier, Sylvain / Merlin, Teva (2005): "The LIUM speech transcription system: a CMU Sphinx III-based system for French broadcast news", in INTERSPEECH-2005, 1653-1656.