Spatial, Temporal and Spectral Multiresolution Analysis for the INTERSPEECH 2019 ComParE Challenge

Marie-José Caraty, Claude Montacié


The INTERSPEECH 2019 Orca Activity Challenge consists in detecting Orca sounds in underwater audio signals. Orcas produce a wide variety of sounds, categorized as clicks, whistles and pulsed calls. Clicks are used for echolocation, while whistles and pulsed calls serve as social signals. Experiments were conducted on the DeepAL Fieldwork Data (DLFD). Underwater sounds were recorded in northern British Columbia by a hydrophone array, and the recordings were labeled by marine biologists as Orca sounds or Noise. We investigated multiresolution analysis at the three main relevant acoustic levels: spatial, temporal and spectral. For this purpose, we studied beamforming array analysis, multitemporal resolution and multilevel wavelet decomposition. For the spatial level, a beamforming algorithm was used to denoise the underwater audio signal. For the temporal level, two sets of three-level multitemporal features were extracted using a pyramidal representation. For the spectral level, wavelet analysis was computed with various wavelet families in order to detect transient sounds. Finally, an Orca Activity detector was designed by combining the ComParE feature set with the multitemporal and multilevel wavelet features. Experiments on the Test set showed a significant improvement of 0.051 over the Challenge baseline performance (0.866).
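
As a rough illustration of the spectral level only, the sketch below shows one way a multilevel wavelet decomposition can expose a transient such as a click. It is not the authors' implementation: the abstract says only that various wavelet families were used, so the Haar wavelet, the function names (`haar_step`, `haar_decompose`, `band_energies`), the per-band energy feature and the toy signal are all assumptions made for this example.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail), each half the length of the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Multilevel decomposition: [detail_1, ..., detail_levels, approx_levels].

    detail_1 holds the finest (highest-frequency) coefficients.
    """
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def band_energies(bands):
    """Energy (sum of squared coefficients) of each sub-band."""
    return [sum(c * c for c in band) for band in bands]


# Hypothetical toy input: silence with a short click-like transient.
toy_signal = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 0.0, 0.0]
toy_energies = band_energies(haar_decompose(toy_signal, levels=3))
```

For this toy signal, essentially all of the energy lands in the finest detail band, which is the kind of behaviour that makes wavelet sub-band features attractive for transient sounds. Because the transform is orthonormal, the sub-band energies sum to the energy of the input.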


DOI: 10.21437/Interspeech.2019-1693

Cite as: Caraty, M., Montacié, C. (2019) Spatial, Temporal and Spectral Multiresolution Analysis for the INTERSPEECH 2019 ComParE Challenge. Proc. Interspeech 2019, 2428-2432, DOI: 10.21437/Interspeech.2019-1693.


@inproceedings{Caraty2019,
  author={Marie-José Caraty and Claude Montacié},
  title={{Spatial, Temporal and Spectral Multiresolution Analysis for the INTERSPEECH 2019 ComParE Challenge}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={2428--2432},
  doi={10.21437/Interspeech.2019-1693},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1693}
}