ISCA Archive Interspeech 2008

A comparative study in automatic recognition of broadcast audio

Stavros Ntalampiras, Nikos Fakotakis

This paper describes a methodology that achieves high accuracy in the automatic analysis of broadcast audio. The main objective is to find a feature set for efficient speech/music discrimination while keeping its dimensionality as small as possible. Three groups of parameters, based on the Mel-scale filterbank, the MPEG-7 standard, and wavelet decomposition, are examined in detail. We annotated on-line radio recordings characterized by great diversity and used them to build probabilistic models and to test four frameworks. The proposed approach uses wavelets and the MPEG-7 ASP descriptor to model speech and music, respectively, and results in a 98.5% average recognition rate.
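
The abstract does not specify the form of the probabilistic models or the classification rule, so the following is only a minimal sketch of the general idea: fit one model per class and label each feature vector with the class whose model scores it highest. The diagonal Gaussian model, the function names, and the synthetic 4-dimensional features are all assumptions made for illustration; they stand in for the wavelet and MPEG-7 ASP features and are not the authors' implementation.

```python
import numpy as np

def fit_diag_gaussian(X):
    """Fit a diagonal Gaussian (assumed model) to the rows of X."""
    mean = X.mean(axis=0)
    var = X.var(axis=0) + 1e-6  # small floor avoids division by zero
    return mean, var

def log_likelihood(X, model):
    """Log-likelihood of each row of X under a diagonal Gaussian."""
    mean, var = model
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (X - mean) ** 2 / var, axis=1)

def classify(X, models):
    """Assign each row the class whose model gives the highest log-likelihood."""
    labels = list(models)
    scores = np.stack([log_likelihood(X, models[c]) for c in labels], axis=1)
    return [labels[i] for i in scores.argmax(axis=1)]

# Synthetic stand-in features (hypothetical); real features would come
# from the audio front end, one row per analysis frame.
rng = np.random.default_rng(0)
speech_train = rng.normal(loc=-2.0, scale=1.0, size=(200, 4))
music_train = rng.normal(loc=2.0, scale=1.0, size=(200, 4))

models = {
    "speech": fit_diag_gaussian(speech_train),
    "music": fit_diag_gaussian(music_train),
}

test_frames = np.array([[-2.0, -2.0, -2.0, -2.0], [2.0, 2.0, 2.0, 2.0]])
predicted = classify(test_frames, models)
```

Keeping one model per class makes it straightforward to use a different feature set for each class, as the proposed approach does, provided the scores are made comparable across feature spaces; this sketch sidesteps that issue by using a single shared feature space.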


doi: 10.21437/Interspeech.2008-619

Cite as: Ntalampiras, S., Fakotakis, N. (2008) A comparative study in automatic recognition of broadcast audio. Proc. Interspeech 2008, 2498-2501, doi: 10.21437/Interspeech.2008-619

@inproceedings{ntalampiras08_interspeech,
  author={Stavros Ntalampiras and Nikos Fakotakis},
  title={{A comparative study in automatic recognition of broadcast audio}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={2498--2501},
  doi={10.21437/Interspeech.2008-619}
}