The SAIL LABS Media Mining Indexer and the CAVA Framework

Erinc Dikici, Gerhard Backfried, Jürgen Riedler


In today’s attention-driven news economy, rapid changes in topics and events go hand in hand with rapid changes in vocabulary and language use. ASR systems that transcribe content from this fluid media landscape need to be kept up to date continuously and dynamically. Static models, potentially created a long time ago, become outdated within a short period of time. For optimal transcription performance, the frequent changes in vocabulary and wording need to be reflected in the models employed. In this demonstration paper, we present the audio processing capabilities of the SAIL LABS Media Mining Indexer and the CAVA Framework, which allows semi-automatic, periodic updates of the ASR vocabulary and language model from new and relevant data.


Cite as: Dikici, E., Backfried, G., Riedler, J. (2019) The SAIL LABS Media Mining Indexer and the CAVA Framework. Proc. Interspeech 2019, 4630-4631.


@inproceedings{Dikici2019,
  author={Erinc Dikici and Gerhard Backfried and Jürgen Riedler},
  title={{The SAIL LABS Media Mining Indexer and the CAVA Framework}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={4630--4631}
}