7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Robust Speech / Music Classification in Audio Documents

Julien Pinquier, Jean-Luc Rouas, Régine André-Obrecht

Institut de Recherche en Informatique de Toulouse, France

This paper presents a novel approach to speech / music segmentation. Three original features are extracted: entropy modulation, stationary segment duration, and number of segments. They are merged with the classical 4 Hz modulation energy. The relevance of these features is studied in a first experiment on a development corpus composed of collected samples of speech and music. A second corpus is used to verify the robustness of the algorithm: this experiment, run on a TV movie soundtrack, reaches a correct identification rate of 90%.
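The abstract does not say how the 4 Hz modulation energy is computed, so the following is only a minimal sketch of one common way to quantify it, not the authors' formulation. It relies on the usual observation that the energy envelope of speech fluctuates at roughly the syllabic rate, near 4 Hz. The function names, the 10 ms frame length, the 2 to 6 Hz band, and the synthetic test tones are all assumptions made for illustration.

```python
import math


def frame_energies(signal, sample_rate, frame_ms=10):
    """Mean squared amplitude of consecutive, non-overlapping frames."""
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len if frame_len > 0 else 0
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def modulation_energy_ratio(signal, sample_rate, frame_ms=10, band=(2.0, 6.0)):
    """Share of the energy-envelope fluctuation power that lies in `band` (Hz).

    Assumed definition for illustration: a plain DFT of the mean-removed
    frame-energy envelope, with in-band power divided by total power.
    """
    env = frame_energies(signal, sample_rate, frame_ms)
    n = len(env)
    if n < 2:
        return 0.0
    mean = sum(env) / n
    env = [e - mean for e in env]          # drop the DC component
    env_rate = 1000.0 / frame_ms           # envelope sampling rate in Hz
    in_band = total = 0.0
    for k in range(1, n // 2 + 1):         # positive-frequency DFT bins
        freq = k * env_rate / n
        re = sum(e * math.cos(2 * math.pi * k * t / n) for t, e in enumerate(env))
        im = sum(e * math.sin(2 * math.pi * k * t / n) for t, e in enumerate(env))
        power = re * re + im * im
        total += power
        if band[0] <= freq <= band[1]:
            in_band += power
    return in_band / total if total > 0 else 0.0


def am_tone(mod_hz, sample_rate=8000, seconds=2, carrier_hz=500, depth=0.8):
    """Hypothetical test signal: a sine carrier amplitude-modulated at mod_hz."""
    return [
        (1 + depth * math.sin(2 * math.pi * mod_hz * i / sample_rate))
        * math.sin(2 * math.pi * carrier_hz * i / sample_rate)
        for i in range(seconds * sample_rate)
    ]
```

On these synthetic tones, a carrier modulated at 4 Hz should score close to 1 and one modulated at 20 Hz close to 0. Real speech and music are far less clean, which is presumably one reason the paper merges this feature with the three others rather than using it alone.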


Bibliographic reference. Pinquier, Julien / Rouas, Jean-Luc / André-Obrecht, Régine (2002): "Robust speech / music classification in audio documents", in ICSLP-2002, 2005-2008.