EUROSPEECH 2003 - INTERSPEECH 2003
We describe a multi-domain conversational test set developed for IBM's Superhuman speech recognition project, along with our 2002 benchmark system for this task. Using multi-pass decoding, unsupervised adaptation, and combination of hypotheses from systems with diverse feature sets and acoustic models, we achieve a word error rate of 32.0% on data drawn from voicemail messages, two-person conversations, and multiple-person meetings.
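For readers unfamiliar with the metric, the word error rate quoted above is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a recognizer's hypothesis and a reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring procedure used by the authors, which is not described here. The function name `word_error_rate` is hypothetical, and the sketch assumes plain whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: assumes whitespace tokenization and exact,
    case-sensitive word matching (no normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.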
Bibliographic reference. Kingsbury, Brian / Mangu, Lidia / Saon, George / Zweig, Geoffrey / Axelrod, Scott / Goel, Vaibhava / Visweswariah, Karthik / Picheny, Michael (2003): "Toward domain-independent conversational speech recognition", In EUROSPEECH-2003, 1881-1884.