1st Joint SIG-IL/Microsoft Workshop on Speech and Language Technologies for Iberian Languages
Porto Salvo, Portugal
In the broadcast news domain, audio segmentation is an important pre-processing step for other speech technologies such as speech recognition and speech diarization. In this work we propose an architecture that integrates the individual detections of various acoustic classes. By implementing a different algorithm adapted to the characteristics of each class, we obtain much better results than with a single generic detector for all classes. Additionally, we introduce new features, suited to detecting telephone-channel speech over wideband music, that improve accuracy.
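The idea of combining class-specific detectors can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the order of the hierarchy, the features, or the algorithms, so the class names, the feature keys (`harmonicity`, `bandwidth_hz`), the thresholds, and both function names below are hypothetical. Each detector is a separate binary decision, and a frame is passed down the hierarchy until one detector claims it.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Frame = Dict[str, float]            # hypothetical per-frame feature dictionary
Detector = Callable[[Frame], bool]  # one binary detector per acoustic class


def hierarchical_labels(
    frames: Sequence[Frame],
    detectors: Sequence[Tuple[str, Detector]],
    default: str = "other",
) -> List[str]:
    """Label each frame with the first class whose detector fires.

    Detectors are tried in the given order (the assumed hierarchy);
    frames claimed by no detector receive the default label.
    """
    labels = []
    for frame in frames:
        label = default
        for name, detect in detectors:
            if detect(frame):
                label = name
                break
        labels.append(label)
    return labels


def merge_segments(labels: Sequence[str]) -> List[Tuple[str, int, int]]:
    """Collapse runs of equal labels into (label, start, end) with end exclusive."""
    segments: List[Tuple[str, int, int]] = []
    for i, label in enumerate(labels):
        if segments and segments[-1][0] == label:
            segments[-1] = (label, segments[-1][1], i + 1)
        else:
            segments.append((label, i, i + 1))
    return segments


# Toy detectors with made-up thresholds, for illustration only.
toy_detectors = [
    ("music", lambda f: f["harmonicity"] > 0.8),
    ("telephone_speech", lambda f: f["bandwidth_hz"] < 4000.0),
]

toy_frames = [
    {"harmonicity": 0.9, "bandwidth_hz": 8000.0},
    {"harmonicity": 0.9, "bandwidth_hz": 3400.0},
    {"harmonicity": 0.2, "bandwidth_hz": 3400.0},
    {"harmonicity": 0.2, "bandwidth_hz": 8000.0},
]

toy_segments = merge_segments(hierarchical_labels(toy_frames, toy_detectors))
```

Because each entry in `toy_detectors` is an independent callable, a detector tailored to one class can be swapped out without touching the others, which is the property the abstract attributes to the proposed architecture.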
Index Terms: audio segmentation, acoustic event detection, music detection, telephone speech, software architecture
Bibliographic reference. Aguilo, Mateu / Butko, Taras / Temko, Andrey / Nadeu, Climent (2009): "A hierarchical architecture for audio segmentation in a broadcast news task", In SLTECH-2009, 17-20.