ISCA Archive IberSPEECH 2022

Respiratory Sound Classification Using an Attention LSTM Model with Mixup Data Augmentation

Noelia Salor-Burdalo, Ascension Gallardo-Antolin

Auscultation is the most common method for diagnosing respiratory diseases, but its reliability depends largely on the physician's skill. To alleviate this drawback, we present an automatic system capable of distinguishing between different types of lung sounds (neutral, wheeze, crackle) in patients' respiratory recordings. The proposed system is based on Long Short-Term Memory (LSTM) networks fed with log-mel spectrograms, to which several improvements have been made. First, the frequency bands that carry the most useful information are determined experimentally in order to enhance the input acoustic features. Second, an attention mechanism is incorporated into the LSTM model to emphasize the audio frames most relevant to the task. Finally, a Mixup data augmentation technique is adopted to mitigate the problem of data imbalance and improve the sensitivity of the system. The proposed methods are evaluated on the publicly available ICBHI 2017 dataset, achieving good results in comparison to the baseline.
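As a rough illustration of the Mixup step mentioned above, the following minimal sketch blends pairs of log-mel spectrograms and their one-hot labels with a weight drawn from a Beta distribution. The function name `mixup_batch`, the value `alpha=0.2`, the batch shapes, and the within-batch pairing are assumptions made for this example; the abstract does not specify the configuration the authors used.

```python
import numpy as np


def mixup_batch(features, labels_onehot, alpha=0.2, rng=None):
    """Blend each example with a randomly chosen partner from the same batch.

    features:      array of shape (batch, frames, mel_bands), e.g. log-mel spectrograms
    labels_onehot: array of shape (batch, num_classes)
    alpha:         Beta distribution parameter (assumed value, not from the paper)
    """
    rng = np.random.default_rng() if rng is None else rng
    # One mixing weight per batch, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha)
    # Random pairing of examples within the batch.
    perm = rng.permutation(len(features))
    mixed_x = lam * features + (1.0 - lam) * features[perm]
    mixed_y = lam * labels_onehot + (1.0 - lam) * labels_onehot[perm]
    return mixed_x, mixed_y, lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical batch: 4 recordings, 100 frames, 64 mel bands.
    x = rng.normal(size=(4, 100, 64))
    # Three classes as in the abstract: neutral, wheeze, crackle.
    y = np.eye(3)[[0, 1, 2, 0]]
    mixed_x, mixed_y, lam = mixup_batch(x, y, rng=rng)
    print(mixed_x.shape, mixed_y.shape, lam)
```

Because the mixed labels are soft targets, a model trained this way would typically use a loss that accepts probability vectors, such as cross-entropy against the blended label rows.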


doi: 10.21437/IberSPEECH.2022-13

Cite as: Salor-Burdalo, N., Gallardo-Antolin, A. (2022) Respiratory Sound Classification Using an Attention LSTM Model with Mixup Data Augmentation. Proc. IberSPEECH 2022, 61-65, doi: 10.21437/IberSPEECH.2022-13

@inproceedings{salorburdalo22_iberspeech,
  author={Noelia Salor-Burdalo and Ascension Gallardo-Antolin},
  title={{Respiratory Sound Classification Using an Attention LSTM Model with Mixup Data Augmentation}},
  year=2022,
  booktitle={Proc. IberSPEECH 2022},
  pages={61--65},
  doi={10.21437/IberSPEECH.2022-13}
}