9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

A Comparative Study in Automatic Recognition of Broadcast Audio

Stavros Ntalampiras, Nikos Fakotakis

University of Patras, Greece

This paper describes a methodology for highly accurate automatic analysis of broadcast audio. The main objective is to find a feature set for efficient speech/music discrimination while keeping its dimensionality as low as possible. Three groups of parameters, based on the Mel-scale filterbank, the MPEG-7 standard, and wavelet decomposition, are examined in detail. We annotated on-line radio recordings characterized by great diversity and used them to build probabilistic models and to test four frameworks. The proposed approach uses wavelets and the MPEG-7 ASP descriptor to model speech and music, respectively, and results in a 98.5% average recognition rate.
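
As a rough illustration of classifying audio with per-class probabilistic models, the sketch below fits one diagonal Gaussian per class and labels a segment by the class with the highest summed frame log-likelihood. This is a minimal sketch under stated assumptions, not the authors' system: the abstract does not specify the model type, so the single Gaussian is a stand-in; the function names are hypothetical; the 2-D feature vectors are invented toy values rather than Mel-filterbank, MPEG-7, or wavelet features; and both classes share one feature space here, whereas the proposed approach pairs different descriptors with speech and music.

```python
import math


def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    # A small floor keeps every variance strictly positive.
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(frame, model):
    """Log-density of one frame under a diagonal Gaussian model."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, means, variances)
    )


def classify_segment(frames, models):
    """Return the label whose model gives the highest total log-likelihood."""
    scores = {
        label: sum(log_likelihood(f, model) for f in frames)
        for label, model in models.items()
    }
    return max(scores, key=scores.get)


# Toy 2-D feature vectors, invented for illustration only (assumption).
speech_train = [[0.9, 0.2], [1.1, 0.1], [1.0, 0.3], [0.8, 0.2]]
music_train = [[0.1, 1.0], [0.2, 0.9], [0.0, 1.1], [0.3, 1.2]]

models = {
    "speech": fit_diag_gaussian(speech_train),
    "music": fit_diag_gaussian(music_train),
}

print(classify_segment([[1.0, 0.2], [0.9, 0.25]], models))
print(classify_segment([[0.1, 1.05], [0.2, 1.0]], models))
```

On this toy data the first segment is assigned to speech and the second to music. A real system would replace the toy vectors with frame-level features extracted from annotated recordings and would likely use a richer model than a single Gaussian per class.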

Bibliographic reference. Ntalampiras, Stavros / Fakotakis, Nikos (2008): "A comparative study in automatic recognition of broadcast audio", in INTERSPEECH-2008, 2498-2501.