Compensating Gender Variability in Query-by-Example Search on Speech Using Voice Conversion

Paula Lopez-Otero, Laura Docio-Fernandez, Carmen Garcia-Mateo


The huge number of available spoken documents has raised the need for tools that perform automatic searches within large audio databases. These collections usually consist of documents that vary widely in speaker, language, and recording channel, among other factors. Reducing this variability would boost the performance of query-by-example search on speech systems, especially zero-resource systems that use acoustic features for audio representation. Hence, this work proposes a technique to compensate for the variability caused by speaker gender. Given a data collection composed of documents spoken by both male and female voices, every time a spoken query has to be searched, an alternative version of the query in the opposite gender is generated using voice conversion. The female version of the query is then used to search within documents spoken by females, and the male version within documents spoken by males. Experimental validation of the proposed strategy shows that reducing gender variability improves search on speech performance.
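The query routing described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation. The names `search_with_gender_matched_query`, `convert`, and `search` are hypothetical placeholders for a voice conversion system and a query-by-example matcher, neither of which is specified in the abstract. The sketch also assumes that each document and the query already carry a gender label, which the abstract does not detail.

```python
from typing import Any, Callable, Dict, Iterable, Tuple

GENDERS = ("female", "male")


def opposite(gender: str) -> str:
    """Return the other gender label, rejecting unknown labels."""
    if gender not in GENDERS:
        raise ValueError(f"unknown gender label: {gender!r}")
    return "male" if gender == "female" else "female"


def search_with_gender_matched_query(
    query: Any,
    query_gender: str,
    documents: Iterable[Tuple[str, str, Any]],
    convert: Callable[[Any, str], Any],
    search: Callable[[Any, Any], float],
) -> Dict[str, float]:
    """Score every document against the query version matching its gender.

    documents: (doc_id, doc_gender, doc_audio) triples; the gender labels
               are assumed to be available (an assumption of this sketch).
    convert:   hypothetical voice conversion, convert(audio, target_gender).
    search:    hypothetical matcher, search(query_audio, doc_audio) -> score.
    """
    other = opposite(query_gender)
    # Generate the opposite-gender version once per query.
    versions = {query_gender: query, other: convert(query, other)}

    scores: Dict[str, float] = {}
    for doc_id, doc_gender, doc_audio in documents:
        # Female version searches female documents, male version male ones.
        scores[doc_id] = search(versions[doc_gender], doc_audio)
    return scores
```

Converting the query once and reusing it for every document keeps the extra cost at one voice conversion per query, independent of the size of the collection.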


DOI: 10.21437/Interspeech.2017-1183

Cite as: Lopez-Otero, P., Docio-Fernandez, L., Garcia-Mateo, C. (2017) Compensating Gender Variability in Query-by-Example Search on Speech Using Voice Conversion. Proc. Interspeech 2017, 2909-2913, DOI: 10.21437/Interspeech.2017-1183.


@inproceedings{Lopez-Otero2017,
  author={Paula Lopez-Otero and Laura Docio-Fernandez and Carmen Garcia-Mateo},
  title={Compensating Gender Variability in Query-by-Example Search on Speech Using Voice Conversion},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2909--2913},
  doi={10.21437/Interspeech.2017-1183},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1183}
}