ISCA Archive Interspeech 2005

Histogram-based quantization (HQ) for robust and scalable distributed speech recognition

Chia-yu Wan, Lin-Shan Lee

The performance of conventional distance-based vector quantization (VQ) for distributed speech recognition (DSR) is inevitably degraded by environmental noise and quantization distortion. The pre-trained codebook is less scalable and may not match the testing speech. A new concept of Histogram-based Quantization (HQ) is proposed in this paper, in which the quantization levels are dynamically defined by the histogram, or order statistics, of a segment of the most recent past samples of the parameter to be quantized. All problems with a pre-trained codebook are automatically solved because no pre-trained codebook is used at all. The computation requirement is low because no distance measure is needed. The approach is robust because most disturbances (including very non-stationary types) can be absorbed by the dynamic histogram. Extensive experiments with the AURORA 2.0 testing environment indicated that the new approach is highly robust and scalable, and suitable for future personalized and context-aware DSR environments.
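The core idea of defining quantization levels from the order statistics of recent samples can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `hq_quantize`, the window length, the number of levels, the equal-rank partition, and the choice to include the current sample in the window are all assumptions made for illustration.

```python
from bisect import bisect_left
from collections import deque


def hq_quantize(samples, window=8, levels=4):
    """Map each sample to a level index in [0, levels) by its rank
    among the most recent `window` samples (hypothetical sketch)."""
    history = deque(maxlen=window)  # sliding segment of recent samples
    indices = []
    for x in samples:
        history.append(x)  # assumption: the current sample is part of the window
        ordered = sorted(history)
        # Rank = number of window samples strictly below x (an order statistic).
        rank = bisect_left(ordered, x)
        # Split the rank range into `levels` equal parts; no codebook, no distances.
        indices.append(rank * levels // len(ordered))
    return indices


clean = [1, 2, 3, 4]
print(hq_quantize(clean, window=4, levels=2))  # [0, 1, 1, 1]
```

Because the output in this sketch depends only on the relative order of samples within the window, any strictly increasing distortion of the parameter (for example a gain change or a constant offset) leaves the indices unchanged, which is one way to read the claim that disturbances are absorbed by the dynamic histogram.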

doi: 10.21437/Interspeech.2005-228

Cite as: Wan, C.-y., Lee, L.-S. (2005) Histogram-based quantization (HQ) for robust and scalable distributed speech recognition. Proc. Interspeech 2005, 957-960, doi: 10.21437/Interspeech.2005-228

  author={Chia-yu Wan and Lin-Shan Lee},
  title={{Histogram-based quantization (HQ) for robust and scalable distributed speech recognition}},
  booktitle={Proc. Interspeech 2005},