This paper introduces VoxCeleb-PT, a small dataset of voices of Portuguese celebrities that can serve as a language-specific extension of the widely used VoxCeleb corpus. In addition to describing the corpus, we present three lab assignments in which it was used in a one-semester speech processing course (age regression, speaker verification, and speech recognition) to highlight its relevance as a pedagogical tool. The paper also confirms the limitations of current systems when evaluated in different languages and acoustic conditions: we found that performance degraded on all of the proposed tasks.
Cite as: Mendonca, J., Trancoso, I. (2022) VoxCeleb-PT – a dataset for a speech processing course. Proc. IberSPEECH 2022, 71-75, doi: 10.21437/IberSPEECH.2022-15
@inproceedings{mendonca22_iberspeech,
  author={John Mendonca and Isabel Trancoso},
  title={{VoxCeleb-PT – a dataset for a speech processing course}},
  year=2022,
  booktitle={Proc. IberSPEECH 2022},
  pages={71--75},
  doi={10.21437/IberSPEECH.2022-15}
}