1st Joint SIG-IL/Microsoft Workshop on Speech and Language Technologies for Iberian Languages

Porto Salvo, Portugal
September 3-4, 2009

A Hierarchical Architecture for Audio Segmentation in a Broadcast News Task

Mateu Aguilo, Taras Butko, Andrey Temko, Climent Nadeu

Department of Signal Theory and Communications, TALP Research Center, Universitat Politècnica de Catalunya, Barcelona, Spain

In the broadcast news domain, audio segmentation is an important pre-processing step for other speech technologies such as speech recognition and speaker diarization. In this work we propose an architecture that integrates the individual detections of various acoustic classes. Implementing a different algorithm adapted to the characteristics of each class yields much better results than using a generic detector for all classes. Additionally, we introduce new features, suited to detecting telephone-channel speech over wideband music, that improve accuracy.
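
As a rough illustration of the idea of integrating per-class detections, the sketch below runs one detector per acoustic class in a fixed priority order and merges the frame labels into segments. It is not the authors' implementation: the abstract does not specify the hierarchy, the classes, or the features, so the class order, the feature names (energy, tonality, hf_ratio), the thresholds, and the helper names label_frames and to_segments are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Frame = Dict[str, float]            # hypothetical per-frame feature values
Detector = Callable[[Frame], bool]  # one binary detector per acoustic class


def label_frames(frames: Sequence[Frame],
                 detectors: Sequence[Tuple[str, Detector]],
                 default: str = "other") -> List[str]:
    """Assign each frame the first class whose detector fires.

    Detectors are tried in list order, so earlier classes sit higher in
    the (assumed) hierarchy and take precedence over later ones.
    """
    labels = []
    for frame in frames:
        label = default
        for name, detect in detectors:
            if detect(frame):
                label = name
                break
        labels.append(label)
    return labels


def to_segments(labels: Sequence[str]) -> List[Tuple[str, int, int]]:
    """Merge runs of equal labels into (label, start, end) with end exclusive."""
    segments: List[Tuple[str, int, int]] = []
    for i, label in enumerate(labels):
        if segments and segments[-1][0] == label:
            segments[-1] = (label, segments[-1][1], i + 1)
        else:
            segments.append((label, i, i + 1))
    return segments


# Toy detectors with made-up features and thresholds (assumptions only).
TOY_DETECTORS: List[Tuple[str, Detector]] = [
    ("silence", lambda f: f["energy"] < 0.1),
    ("music", lambda f: f["tonality"] > 0.5),
    ("telephone_speech", lambda f: f["hf_ratio"] < 0.2),
]
```

Because each class has its own detector function, a class-specific algorithm can be swapped in without touching the integration step, which is the design point the abstract emphasizes.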

Index Terms: audio segmentation, acoustic event detection, music detection, telephone speech, software architecture

Bibliographic reference. Aguilo, Mateu / Butko, Taras / Temko, Andrey / Nadeu, Climent (2009): "A hierarchical architecture for audio segmentation in a broadcast news task", in SLTECH-2009, 17-20.