ISCA Archive ICSLP 1998

Speech, silence, music and noise classification of TV broadcast material

Ara Samouelian, Jordi Robert-Ribes, Mike Plumpe

Speech processing can be of great help in indexing and archiving TV broadcast material. Broadcasting station standards will soon be digital, and there will be a huge increase in the use of speech processing techniques for maintaining archives as well as accessing them. We present an application of information theory to the classification and automatic labelling of TV broadcast material into speech, music and noise. We use information theory to construct a decision tree from several different TV programs and then apply it to a different set of TV programs. We present classification results on training and test data sets. The frame-level correct classification rate was 95.5% for training data, while for test data it ranged from 60.4% to 84.5%, depending on TV program type. At the segment level, the correct recognition rate and accuracy on training data were 100% and 95.1%, respectively, while for test data the percent correct ranged from 80% to 100% and the accuracy ranged from 64.7% to 100%.
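
The abstract does not say which splitting criterion or acoustic features the decision tree uses. As a minimal sketch of one common information-theoretic choice, the Python below picks a threshold on a single per-frame feature by maximising information gain (the reduction in Shannon entropy of the class labels). The function names, the "energy" feature, and the toy frame values are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting frames at value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty tells us nothing
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder


def best_threshold(values, labels):
    """Return (threshold, gain) for the best midpoint between distinct values.

    Assumes the feature takes at least two distinct values.
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    scored = ((t, information_gain(values, labels, t)) for t in candidates)
    return max(scored, key=lambda pair: pair[1])


# Hypothetical per-frame energies and labels, invented for illustration only.
energies = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = ["noise", "noise", "noise", "speech", "speech", "speech"]
threshold, gain = best_threshold(energies, labels)
```

A full tree would repeat this search over every available feature at each node and recurse on the two resulting subsets until the frames in a node are sufficiently pure.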


doi: 10.21437/ICSLP.1998-548

Cite as: Samouelian, A., Robert-Ribes, J., Plumpe, M. (1998) Speech, silence, music and noise classification of TV broadcast material. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0620, doi: 10.21437/ICSLP.1998-548

@inproceedings{samouelian98_icslp,
  author={Ara Samouelian and Jordi Robert-Ribes and Mike Plumpe},
  title={{Speech, silence, music and noise classification of TV broadcast material}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0620},
  doi={10.21437/ICSLP.1998-548}
}