ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing

ICC Jeju, Korea
October 3, 2004

Features for Segmenting and Classifying Long-Duration Recordings of "Personal" Audio

Daniel P. W. Ellis, Keansub Lee

LabROSA, Dept. of Electrical Engineering, Columbia University, New York, NY, USA

A digital recorder weighing ounces and able to record for more than ten hours can be bought for a few hundred dollars. Such devices make possible continuous recordings of "personal audio", storing essentially everything heard by the owner. Without automatic indexing, however, such recordings are almost useless. In this paper, we describe some experiments with recordings of this kind, focusing on the problem of segmenting the recordings into different "episodes" corresponding to different acoustic environments experienced by the device. We present several novel features for characterizing 1-minute-long frames of audio, and investigate their effectiveness at reproducing hand-labeled ground-truth segment boundaries.



Bibliographic reference. Ellis, Daniel P. W. / Lee, Keansub (2004): "Features for segmenting and classifying long-duration recordings of 'personal' audio", in SAPA-2004, paper 106.