Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Progress in Automatic Meeting Transcription

Hua Yu, Michael Finke, Alex Waibel

Interactive Systems Laboratories, Carnegie Mellon University, Pittsburgh, PA, USA

In this paper we report recent developments on the meeting transcription task, a large-vocabulary conversational speech recognition task. Previous experiments showed that this is a very challenging task, with a word error rate (WER) of about 50% using existing recognizers. The difficulty comes mostly from the highly disfluent, conversational nature of meetings and from the lack of domain-specific training data. To address the first problem, our Switchboard (SWB) system, a conversational telephone speech recognizer, was used to recognize wide-band meeting data; to address the second, we leveraged the large amount of Broadcast News (BN) data to build a robust system. This paper focuses on two experiments in the BN system development: model combination and HMM topology/duration modeling. Model combination can be done at various stages of recognition: post-processing schemes such as ROVER can lead to significant improvements; to reduce computation, we tried model combination at the acoustic score level. We also show the importance of temporal constraints in decoding and present some HMM topology/duration modeling experiments. Finally, the meeting browser system and meeting room setup are reviewed.
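For readers unfamiliar with the WER figure quoted above, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name and the example sentences are illustrative assumptions and are not taken from the paper; scoring tools used in practice also normalize the text before alignment, which this sketch omits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (not from the paper); assumes whitespace
    tokenization and no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# Made-up example: one substitution ("starts" -> "start") and one
# inserted "at" against a five-word reference.
example_wer = word_error_rate(
    "the meeting starts at noon",
    "the meeting start at at noon",
)
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.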


Bibliographic reference.  Yu, Hua / Finke, Michael / Waibel, Alex (1999): "Progress in automatic meeting transcription", In EUROSPEECH'99, 695-698.