This paper presents our work on building the speech activity detection (SAD) systems for the phase 3 evaluation of the RATS program. The work focused on improving SAD performance with the neural network (NN) approach. The major efforts were reducing false-rejection errors, by extending the speech regions in the training references and by using post-processing NNs, and removing channel variations, by designing channel bottleneck features with the deep NN learning approach. With these efforts, a relative improvement of more than 25% was achieved over the phase 2 evaluation system. The larger contribution of the bottleneck features was on new channels, where our results showed that they improved SAD performance significantly.
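The abstract does not say how the speech regions in the training references were extended. As a minimal sketch of one plausible reading, the following pads each reference speech segment by a fixed number of frames on both sides and merges segments that then overlap. The function name `extend_speech_regions`, the frame-based units, and the padding value are illustrative assumptions, not details taken from the paper.

```python
def extend_speech_regions(segments, pad, total_frames):
    """Pad each (start, end) speech segment by `pad` frames and merge overlaps.

    Hypothetical helper: the paper's actual extension rule is not given
    in the abstract. Segments are half-open frame ranges [start, end).
    """
    extended = []
    for start, end in sorted(segments):
        # Widen the segment, clamped to the bounds of the recording.
        start = max(0, start - pad)
        end = min(total_frames, end + pad)
        if extended and start <= extended[-1][1]:
            # The widened segment touches the previous one: merge them.
            prev_start, prev_end = extended[-1]
            extended[-1] = (prev_start, max(prev_end, end))
        else:
            extended.append((start, end))
    return extended


if __name__ == "__main__":
    reference = [(100, 200), (230, 300), (980, 1000)]
    print(extend_speech_regions(reference, pad=20, total_frames=1000))
```

Labeling a margin around each speech segment as speech biases the trained detector toward accepting frames near speech boundaries, which is one way such an extension could reduce false rejections, at the possible cost of more false alarms.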
Bibliographic reference. Ma, Jeff (2014): "Improving the speech activity detection for the DARPA RATS phase-3 evaluation", In INTERSPEECH-2014, 1558-1562.