8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

From Switchboard to Meetings: Development of the 2004 ICSI-SRI-UW Meeting Recognition System

Andreas Stolcke (1,2), Chuck Wooters (1), Ivan Bulyko (3), Martin Graciarena (2), Scott Otterson (3), Barbara Peskin (1), Mari Ostendorf (3), Dave Gelbart (1,4), Nikki Mirghafori (1), Tuomo Pirinen (1,5)

(1) International Computer Science Institute, USA
(2) SRI International, USA
(3) University of Washington, USA
(4) University of California at Berkeley, USA
(5) Tampere University of Technology, Finland

We describe the ICSI-SRI-UW team's entry in the Spring 2004 NIST Meeting Recognition Evaluation. The system was derived from SRI's 5xRT Conversational Telephone Speech (CTS) recognizer by adapting CTS acoustic and language models to the meeting domain, adding noise reduction and delay-sum array processing for far-field recognition, and adding postprocessing for cross-talk suppression. A modified MAP adaptation procedure was developed to make best use of discriminatively trained (MMIE) prior models. These meeting-specific changes yielded an overall 9% and 22% relative improvement over the original CTS system, and 16% and 29% relative improvement over our 2002 Meeting Evaluation system, for the individual-headset and multiple-distant-microphone conditions, respectively.
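
The delay-sum array processing mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the system estimates inter-microphone delays or weights the channels, so the code below assumes the delays are already known as non-negative whole-sample offsets and simply averages the time-aligned channels. The function name `delay_and_sum` and its parameters are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Average several microphone channels after undoing their arrival lags.

    channels[i][n] is sample n of microphone i.
    delays[i] is the (assumed known, non-negative) number of samples by which
    the source reaches microphone i later than the reference microphone.
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    if not channels:
        return []
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative sample counts")

    # Keep only the samples for which every shifted channel still has data.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    if length <= 0:
        return []

    output = []
    for n in range(length):
        # Reading channel i at index n + delay undoes its arrival lag,
        # so the source signal adds coherently across microphones.
        total = sum(ch[n + d] for ch, d in zip(channels, delays))
        output.append(total / len(channels))
    return output
```

For example, if a second microphone receives the same signal two samples later than the first, calling `delay_and_sum([mic0, mic1], [0, 2])` realigns the two channels before averaging them. Uncorrelated noise on the individual microphones tends to average down, while the aligned source signal is preserved.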


Bibliographic reference. Stolcke, Andreas / Wooters, Chuck / Bulyko, Ivan / Graciarena, Martin / Otterson, Scott / Peskin, Barbara / Ostendorf, Mari / Gelbart, Dave / Mirghafori, Nikki / Pirinen, Tuomo (2004): "From Switchboard to meetings: development of the 2004 ICSI-SRI-UW meeting recognition system", in INTERSPEECH-2004, 1957-1960.