Auditory Filterbank Learning Using ConvRBM for Infant Cry Classification

Hardik B. Sailor, Hemant Patil


Infant cry classification is a socially relevant problem in which the task is to distinguish normal from pathological cry signals. Because cry signals differ markedly from speech signals in temporal and spectral content, they call for a better feature representation. In this paper, we propose unsupervised auditory filterbank learning using a Convolutional Restricted Boltzmann Machine (ConvRBM). Analysis of the subband filters shows that most of them are Fourier-like basis functions. Infant cry classification experiments were performed on two databases, namely, DA-IICT Cry and Baby Chillanto. The experimental results show that the proposed features outperform standard Mel Frequency Cepstral Coefficients (MFCC) on various statistically meaningful performance measures. In particular, the proposed ConvRBM-based features obtained absolute improvements of 2% and 0.58% in classification accuracy on the DA-IICT Cry and Baby Chillanto databases, respectively.


DOI: 10.21437/Interspeech.2018-1536

Cite as: Sailor, H.B., Patil, H. (2018) Auditory Filterbank Learning Using ConvRBM for Infant Cry Classification. Proc. Interspeech 2018, 706-710, DOI: 10.21437/Interspeech.2018-1536.


@inproceedings{Sailor2018,
  author={Hardik B. Sailor and Hemant Patil},
  title={Auditory Filterbank Learning Using ConvRBM for Infant Cry Classification},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={706--710},
  doi={10.21437/Interspeech.2018-1536},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1536}
}