Description of the CAIP speech corpus

Qiguang Lin, ChiWei Che, Joe French

As part of our effort to develop a synergistically integrated system of microphone arrays and neural networks (MANN) for robust large-vocabulary continuous speech recognition in variable acoustical environments, we recently collected a sizable amount of "hands-free" speech data. This database comprises stereo recordings of read speech made with a head-mounted microphone, a desk-mounted microphone, and a one-dimensional beamforming line array of 29 microphones. Both the desk-mounted microphone and the array were positioned 3 meters from the subject. The corpus has been used to evaluate the capability of the MANN system for robust speech and speaker recognition under adverse conditions. The purpose of this paper is to document the data collection process and to present some results of acoustic analyses of the collected data, with an emphasis on array speech.

