The ACLEW DiViMe: An Easy-to-use Diarization Tool

Adrien Le Franc, Eric Riebling, Julien Karadayi, Yun Wang, Camila Scaff, Florian Metze, Alejandrina Cristia

We present "DiViMe", an open-source virtual machine aimed at packaging speech technology for real-life data and developed in the context of the "Analyzing Children's Language Environments across the World" (ACLEW) project. This first release focuses on Speech Activity Detection, Speaker Diarization, and their evaluation. The present paper introduces the set of included tools and the current workflow, which is designed to make minimal assumptions about users' technical skills. Additionally, we show how the current DiViMe tools fare against three sets of challenging data. In a first experiment, we look at performance on samples extracted from daylong recordings gathered using the LENA™ system from English-learning children. We find that the performance of the tools currently in DiViMe is not far from that achieved by the proprietary LENA™ software. In a second experiment, we generalize to other samples of child-centered daylong files, gathered with non-LENA™ hardware from non-English-learning children, showing that performance does not degrade in this condition. Finally, we report on performance on the DiHARD 2018 Challenge test data. Originally conceived in the "Speech Recognition Virtual Kitchen", DiViMe is a promising platform for packaging speech technology tools for widespread re-use, with potential impact on both fundamental and applied speech and language research.

DOI: 10.21437/Interspeech.2018-2324

