ISCA Archive Interspeech 2015

Spectrally selective dithering for distorted speech recognition

Michal Borsky, Petr Mizera, Petr Pollak

The performance of speech recognition systems can degrade significantly when the speech spectrum is distorted, for example by an unsuitable recording device, enhancement technique, or speech coder. This paper presents a front-end compensation method, called spectrally selective dithering, that aims to reconstruct the spectral characteristics of nonlinearly distorted speech. The technique detects the suppressed frequency bands in the speech signal and adds a weighted amount of noise. The detection algorithm is based on the smoothness of the excitation signal spectrum obtained by LPC analysis filtering. The gain of the added noise is estimated from the unaffected frequency bands. The practical usability of the algorithm was studied in the task of recognizing MP3-coded speech at very low bit-rates. The results demonstrate the advantage of the proposed technique: we achieved up to a 1.85% absolute WER reduction using the standard HMM-GMM architecture in an LVCSR task.
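The processing chain described in the abstract (LPC analysis, detection of suppressed bands in the excitation spectrum, noise insertion scaled from the unaffected bands) can be illustrated with a minimal per-frame sketch. This is not the authors' implementation. The function names, the LPC order, the moving-average smoothing, the relative threshold, the random-phase noise, and the `noise_weight` factor are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def lpc_inverse_filter(frame, order=12, white_noise=1e-3):
    """Autocorrelation-method LPC. Returns A(z) = [1, -a_1, ..., -a_p]."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
    # White-noise correction (assumed value) keeps the system well conditioned.
    r[0] = r[0] * (1.0 + white_noise) + 1e-12
    idx = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    a = np.linalg.solve(r[idx], r[1:order + 1])  # Toeplitz normal equations
    return np.concatenate(([1.0], -a))

def selective_dither(frame, order=12, rel_threshold=0.1, smooth_bins=9,
                     noise_weight=0.1, rng=None):
    """Add noise only in bins whose smoothed excitation spectrum is suppressed.

    Returns the dithered frame and the boolean mask of flagged rfft bins.
    All parameter defaults are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    a = lpc_inverse_filter(frame * np.hanning(n), order)

    # Excitation spectrum: signal spectrum times the inverse-filter response
    # (equivalent to circular filtering of the frame with A(z)).
    spec = np.fft.rfft(frame)
    excitation = np.abs(spec * np.fft.rfft(a, n))

    # Moving average as a simple stand-in for a smoothness measure.
    kernel = np.ones(smooth_bins) / smooth_bins
    smooth = np.convolve(excitation, kernel, mode="same")
    suppressed = smooth < rel_threshold * smooth.mean()
    if not suppressed.any():
        return frame.copy(), suppressed

    # Noise gain estimated from the bins that were not flagged.
    gain = noise_weight * np.abs(spec[~suppressed]).mean()
    phases = rng.uniform(0.0, 2.0 * np.pi, int(suppressed.sum()))
    noise_spec = np.zeros(len(spec), dtype=complex)
    noise_spec[suppressed] = gain * np.exp(1j * phases)
    return frame + np.fft.irfft(noise_spec, n), suppressed
```

Under these assumptions, calling `selective_dither(frame)` on a frame whose upper band has been removed returns a frame with low-level noise in the flagged bins, while the spectrum in the unflagged bins is left unchanged. A real front-end would apply this frame by frame with overlap-add and would need the thresholds tuned on data.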

doi: 10.21437/Interspeech.2015-601

Cite as: Borsky, M., Mizera, P., Pollak, P. (2015) Spectrally selective dithering for distorted speech recognition. Proc. Interspeech 2015, 2858-2861, doi: 10.21437/Interspeech.2015-601

  author={Michal Borsky and Petr Mizera and Petr Pollak},
  title={{Spectrally selective dithering for distorted speech recognition}},
  booktitle={Proc. Interspeech 2015},