Speech Database and Protocol Validation Using Waveform Entropy

Itshak Lapidot, Héctor Delgado, Massimiliano Todisco, Nicholas Evans, Jean-François Bonastre


The assessment of performance for any number of speech processing tasks calls for the use of a suitably large, representative dataset. Dataset design is crucial to ensure that any significant variation unrelated to the task at hand is adequately normalised or marginalised. Most datasets are partitioned into training, development and evaluation subsets. Depending on the task, the nature of these three subsets should normally be close to identical. With speech signals being subject to a multitude of different influences, e.g. speaker gender and age, language, dialect, utterance length, etc., the design and validation of speech datasets can become especially challenging. Even if many sources of variation unrelated to the task at hand can easily be marginalised, other sources of more subtle variation can easily be overlooked. Imbalances between the training, development and evaluation partitions can bring into question findings derived from their use. Stringent dataset validation procedures are therefore required. This paper reports a particularly straightforward approach to dataset validation based upon waveform entropy.
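To illustrate the idea of validating dataset partitions via waveform entropy, the following is a minimal sketch, not the paper's exact method: the specific entropy estimator (number of bins, amplitude normalisation) is an assumption made here for illustration. It histograms a waveform's sample amplitudes and computes the Shannon entropy of the resulting empirical distribution; partitions whose entropy statistics diverge markedly would warrant closer inspection.

```python
import numpy as np

def waveform_entropy(samples: np.ndarray, n_bins: int = 256) -> float:
    """Shannon entropy (bits) of a waveform's amplitude distribution.

    Illustrative sketch only: the binning scheme (equal-width bins over
    the observed amplitude range) is an assumption, not taken from the paper.
    """
    hist, _ = np.histogram(samples, bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins; 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log2(p)))

# Toy comparison on synthetic waveforms: two partitions drawn from the
# same distribution should have similar entropy, while a distorted
# (here, hard-clipped) partition stands out as an imbalance.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 0.1, 16000)                          # partition A
evaluation = rng.normal(0.0, 0.1, 16000)                     # partition B
clipped = np.clip(rng.normal(0.0, 0.5, 16000), -0.2, 0.2)    # distorted set

h_train = waveform_entropy(train)
h_eval = waveform_entropy(evaluation)
h_clip = waveform_entropy(clipped)
print(h_train, h_eval, h_clip)
```

In practice one would compute such a statistic per utterance and compare its distribution across the training, development and evaluation partitions, flagging partitions whose entropy profiles differ systematically.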


DOI: 10.21437/Interspeech.2018-2330

Cite as: Lapidot, I., Delgado, H., Todisco, M., Evans, N., Bonastre, J.-F. (2018) Speech Database and Protocol Validation Using Waveform Entropy. Proc. Interspeech 2018, 2773-2777, DOI: 10.21437/Interspeech.2018-2330.


@inproceedings{Lapidot2018,
  author={Itshak Lapidot and Héctor Delgado and Massimiliano Todisco and Nicholas Evans and Jean-François Bonastre},
  title={Speech Database and Protocol Validation Using Waveform Entropy},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={2773--2777},
  doi={10.21437/Interspeech.2018-2330},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2330}
}