Indoor/Outdoor Audio Classification Using Foreground Speech Segmentation

Banriskhem K. Khonglah, K.T. Deepak, S.R. Mahadeva Prasanna


This work addresses indoor/outdoor audio classification using foreground speech segmentation. Foreground speech segmentation uses signal features to separate foreground speech from background interfering sources such as noise. First, the foreground and background segments are obtained by using the normalized autocorrelation peak strength (NAPS) of the zero frequency filtered signal (ZFFS) as a feature. The background segments are then used to decide whether a given audio sample is indoor or outdoor. Mel frequency cepstral coefficients (MFCCs) are extracted from the background segments of both the indoor and outdoor audio samples and are used to train a Support Vector Machine (SVM) classifier. Using foreground speech segmentation gives promising performance on the indoor/outdoor audio classification task.
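The abstract names NAPS as the segmentation feature but does not define it. As a rough illustration only, the sketch below takes one common reading: the largest value of a frame's autocorrelation, normalized by its zero-lag value, within a plausible pitch-lag range. The function name `naps`, the 50-400 Hz search range, and the choice to apply it to any input frame (rather than specifically to a ZFFS frame) are assumptions made here, not details from the paper.

```python
import numpy as np

def naps(frame, sample_rate, min_f0=50.0, max_f0=400.0):
    """Normalized autocorrelation peak strength of one frame (illustrative).

    Assumed definition: the maximum of the autocorrelation, divided by the
    zero-lag value, over lags corresponding to pitch between min_f0 and
    max_f0. The search range is a hypothetical choice, not from the paper.
    """
    frame = np.asarray(frame, dtype=np.float64)
    frame = frame - frame.mean()          # remove DC offset
    energy = np.dot(frame, frame)         # autocorrelation at lag 0
    if energy == 0.0:
        return 0.0                        # silent frame: no periodicity

    # One-sided autocorrelation, lags 0 .. len(frame) - 1
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    ac = ac / energy

    lo = int(sample_rate / max_f0)                        # shortest pitch lag
    hi = min(int(sample_rate / min_f0), len(frame) - 1)   # longest pitch lag
    if lo > hi:
        return 0.0                        # frame too short for this range
    return float(ac[lo:hi + 1].max())
```

Under this reading, a strongly periodic frame (as expected for nearby voiced speech) scores higher than a noise-like frame, so thresholding the value per frame would give a foreground/background decision. The threshold and framing parameters would have to come from the paper itself.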


DOI: 10.21437/Interspeech.2017-309

Cite as: Khonglah, B.K., Deepak, K.T., Prasanna, S.R.M. (2017) Indoor/Outdoor Audio Classification Using Foreground Speech Segmentation. Proc. Interspeech 2017, 464-468, DOI: 10.21437/Interspeech.2017-309.


@inproceedings{Khonglah2017,
  author={Banriskhem K. Khonglah and K.T. Deepak and S.R. Mahadeva Prasanna},
  title={Indoor/Outdoor Audio Classification Using Foreground Speech Segmentation},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={464--468},
  doi={10.21437/Interspeech.2017-309},
  url={http://dx.doi.org/10.21437/Interspeech.2017-309}
}