16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Investigation of Parametric Rectified Linear Units for Noise Robust Speech Recognition

Sunil Sivadas (1), Zhenzhou Wu (2), Ma Bin (1)

(1) A*STAR, Singapore
(2) McGill University, Canada

Convolutional neural networks with rectified linear units (ReLU) have been successful in speech recognition and computer vision tasks. ReLU was proposed as a better match to biological neural activation functions compared to the sigmoidal non-linearity. However, ReLU has the disadvantage that the gradient is zero whenever the unit is not active or saturated. To alleviate the potential problems due to the zero gradient, Leaky ReLU (LReLU) was proposed. Recently, a parametrized form of ReLU (PReLU) was shown to give superior performance compared to ReLU on large-scale computer vision tasks. PReLU is a generalized version of LReLU in which the negative-region slope is learned adaptively from the training data. In this paper we investigate PReLU-based deep convolutional neural networks for noise robust speech recognition. We report experimental results on the Aurora-4 multi-condition training task. We show that PReLU gives slightly better Word Error Rates (WERs) on the noisy test sets compared to ReLU. In combination with dropout regularization we report one of the best WERs in the literature for this noisy speech recognition task.
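The relationship between the three activation functions described above can be made concrete in a minimal NumPy sketch (not from the paper; the function names and the gradient helper are illustrative): a slope `a` of 0 recovers ReLU, a small fixed `a` (e.g. 0.01) recovers LReLU, and PReLU treats `a` as a trainable parameter updated via its gradient.

```python
import numpy as np

def prelu(x, a):
    """Parametric ReLU: identity for positive inputs, slope `a` for
    negative inputs. a = 0 gives ReLU; a fixed small a (e.g. 0.01)
    gives Leaky ReLU; in PReLU, `a` is learned from the data."""
    return np.where(x > 0, x, a * x)

def prelu_grad_a(x):
    """Gradient of the PReLU output with respect to the slope `a`,
    which is what allows `a` to be learned by backpropagation:
    zero for positive inputs, x itself for negative inputs."""
    return np.where(x > 0, 0.0, x)
```

Note that `prelu_grad_a` is nonzero exactly where ReLU's gradient vanishes, which is the motivation the abstract gives for LReLU and PReLU.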


Bibliographic reference. Sivadas, Sunil / Wu, Zhenzhou / Bin, Ma (2015): "Investigation of parametric rectified linear units for noise robust speech recognition", In INTERSPEECH-2015, 3234-3238.