Interspeech'2005 - Eurospeech
We present a conversational telephone speech data set designed to support research on novel acoustic models. Small-vocabulary tasks, ranging from 10 words up to 500 words, are defined using subsets of the Switchboard-1 corpus; each task has a completely closed vocabulary (an out-of-vocabulary, or OOV, rate of 0%). We justify the need for these tasks, describe the algorithm for selecting them from a large corpus, give a statistical analysis of the data, and present baseline whole-word hidden Markov model recognition results. The goal of the paper is to define a common data set and to encourage other researchers to use it.
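To make the closed-vocabulary property concrete, here is a minimal sketch of one way a 0% OOV subset could be carved out of a larger transcript collection. It is not the selection algorithm from the paper, which the abstract does not spell out. It assumes a simple frequency-based rule (keep only utterances whose words all fall within the N most frequent word types), and the function names `closed_vocab_subset` and `oov_rate` are hypothetical.

```python
from collections import Counter


def closed_vocab_subset(utterances, vocab_size):
    """Keep utterances made up entirely of the vocab_size most frequent words.

    Assumption: a plain frequency cut-off, used only for illustration;
    this is not the selection procedure described in the paper.
    """
    counts = Counter(word for utt in utterances for word in utt)
    vocab = {word for word, _ in counts.most_common(vocab_size)}
    # Drop empty utterances and any utterance containing a word outside vocab.
    subset = [utt for utt in utterances if utt and all(w in vocab for w in utt)]
    return vocab, subset


def oov_rate(utterances, vocab):
    """Fraction of word tokens that are not in vocab (0.0 means closed)."""
    tokens = [word for utt in utterances for word in utt]
    if not tokens:
        return 0.0
    return sum(word not in vocab for word in tokens) / len(tokens)


if __name__ == "__main__":
    # Toy transcripts standing in for corpus utterances (invented data).
    toy = [["yeah"], ["uh", "huh"], ["yeah", "right"],
           ["i", "think", "so"], ["right"]]
    vocab, subset = closed_vocab_subset(toy, vocab_size=2)
    print(sorted(vocab), subset, oov_rate(subset, vocab))
```

By construction every retained utterance contains only in-vocabulary words, so the OOV rate measured on the subset is zero, while the same vocabulary measured against the full collection generally is not.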
Bibliographic reference. King, Simon / Bartels, Chris / Bilmes, Jeff (2005): "SVitchboard 1: small vocabulary tasks from Switchboard", In INTERSPEECH-2005, 3385-3388.