ISCA Archive SPSC 2022

Speaker anonymization by pitch shifting based on time-scale modification

Candy Olivia Mawalim, Shogo Okada, Masashi Unoki

The increasing use of speech in digital technology raises privacy concerns because speech contains biometric information. Several methods have been proposed to address this issue, including speaker anonymization or de-identification. Speaker anonymization aims to suppress personally identifiable information (PII) while preserving other speech properties, including linguistic information. In this study, we apply time-scale modification (TSM), a speech signal processing technique, to speaker anonymization. Speech signal processing approaches are significantly less complex than the state-of-the-art x-vector-based speaker anonymization method because they do not require a training process. We propose anonymization methods using two major categories of TSM: synchronous overlap-add (SOLA)-based algorithms and phase vocoder-based TSM (PV-TSM). To evaluate the proposed methods, we use the standard objective evaluation introduced in the VoicePrivacy challenge. The results show that our PV-TSM-based method balances privacy and utility metrics better than the baseline systems, especially when evaluated with an automatic speaker verification (ASV) system on anonymized enrollment and anonymized trial data (a-a). Further, our method outperformed the x-vector-based speaker anonymization method, which is limited by its complex training process, low privacy in the a-a scenario, and low voice distinctiveness.
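The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of pitch shifting through TSM: stretch the signal in time without changing its pitch, then resample it back to the original length so that all frequencies are scaled by the same factor. It assumes a simplified WSOLA-style stretch (one member of the SOLA family) and plain linear-interpolation resampling; it is not the authors' implementation and does not cover PV-TSM. The function names `wsola_stretch`, `resample_linear`, and `pitch_shift`, and the parameters `alpha`, `frame_len`, and `tol`, are hypothetical and chosen for illustration.

```python
import math


def wsola_stretch(x, alpha, frame_len=256, tol=64):
    """Stretch the duration of x by alpha with a simplified WSOLA (pitch kept)."""
    hop_s = frame_len // 2                 # synthesis hop (50% overlap)
    hop_a = hop_s / alpha                  # analysis hop
    # Periodic Hann window: overlapping copies at 50% overlap sum to one.
    win = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
           for i in range(frame_len)]
    n_frames = int(len(x) / hop_a) + 1
    # Zero-pad so that every candidate frame stays inside the buffer.
    pad = [0.0] * tol + list(x) + [0.0] * (frame_len + hop_s + 3 * tol)
    y = [0.0] * ((n_frames - 1) * hop_s + frame_len)
    prev = None                            # start of the previously copied frame
    for m in range(n_frames):
        nominal = int(round(m * hop_a)) + tol
        best = nominal
        if prev is not None:
            # Natural continuation of the previous frame in the input.
            target = pad[prev + hop_s: prev + hop_s + frame_len]
            best_score = -math.inf
            # Search around the nominal position for the best-matching frame.
            for d in range(-tol, tol + 1):
                cand = pad[nominal + d: nominal + d + frame_len]
                score = sum(a * b for a, b in zip(target, cand))
                if score > best_score:
                    best_score, best = score, nominal + d
        for i in range(frame_len):
            y[m * hop_s + i] += win[i] * pad[best + i]
        prev = best
    return y[:int(round(alpha * len(x)))]


def resample_linear(x, out_len):
    """Resample x (at least two samples) to out_len samples, linearly."""
    step = (len(x) - 1) / max(out_len - 1, 1)
    out = []
    for k in range(out_len):
        pos = k * step
        i = min(int(pos), len(x) - 2)
        frac = pos - i
        out.append((1 - frac) * x[i] + frac * x[i + 1])
    return out


def pitch_shift(x, alpha):
    """Scale the pitch of x by alpha while keeping the number of samples."""
    stretched = wsola_stretch(x, alpha)        # alpha times longer, same pitch
    return resample_linear(stretched, len(x))  # original length, pitch * alpha
```

Under these assumptions, `pitch_shift(x, 1.25)` raises the pitch by a factor of 1.25 and a value of `alpha` below 1 lowers it. A practical system would use a band-limited resampler rather than linear interpolation and would choose the frame length and search tolerance to suit the sampling rate.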

doi: 10.21437/SPSC.2022-7

Cite as: Mawalim, C.O., Okada, S., Unoki, M. (2022) Speaker anonymization by pitch shifting based on time-scale modification. Proc. 2nd Symposium on Security and Privacy in Speech Communication, 35-42, doi: 10.21437/SPSC.2022-7

  author={Candy Olivia Mawalim and Shogo Okada and Masashi Unoki},
  title={{Speaker anonymization by pitch shifting based on time-scale modification}},
  booktitle={Proc. 2nd Symposium on Security and Privacy in Speech Communication},