5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Discriminative Weighting of Multi-Resolution Sub-Band Cepstral Features for Speech Recognition

Philip McMahon, Paul McCourt, Saeed Vaseghi

The Queen's University of Belfast, Ireland

This paper explores strategies for recombining independent multi-resolution sub-band-based recognisers. The multi-resolution approach rests on the premise that additional cues for phonetic discrimination may exist in the spectral correlates of one sub-band but not in another. Recombination weights are derived via discriminative training using the Minimum Classification Error (MCE) criterion on log-likelihood scores. Under this criterion, the weights for the correct class and for competing classes are adjusted in opposite directions, which enforces separation between confusable classes. Discriminative recombination is shown to provide significant performance gains for both phone classification and continuous recognition tasks on the TIMIT database. Weighted recombination of independent multi-resolution sub-band models is also shown to improve robustness in broadband noise.
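
The opposite-direction weight adjustment described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper. It assumes one weight per class and sub-band, a sigmoid-smoothed loss, and a misclassification measure that uses only the single best competing class. The names `combined_scores`, `mce_step`, `lr` and `gamma` are hypothetical.

```python
import math


def combined_scores(weights, loglik):
    """Weighted sum of sub-band log-likelihoods for each class.

    weights[c][b] and loglik[c][b] are indexed by class c and sub-band b.
    """
    return [
        sum(w * l for w, l in zip(weights[c], loglik[c]))
        for c in range(len(weights))
    ]


def mce_step(weights, loglik, correct, lr=0.1, gamma=1.0):
    """One gradient-descent step on a sigmoid MCE loss (assumed form).

    Returns a new weight table and the loss before the update.
    """
    scores = combined_scores(weights, loglik)
    # Assumption: only the best-scoring wrong class acts as the competitor.
    competitor = max(
        (c for c in range(len(scores)) if c != correct),
        key=lambda c: scores[c],
    )
    # Misclassification measure: positive when the competitor wins.
    d = scores[competitor] - scores[correct]
    loss = 1.0 / (1.0 + math.exp(-gamma * d))
    slope = gamma * loss * (1.0 - loss)  # derivative of the loss w.r.t. d

    new_weights = [row[:] for row in weights]
    for b in range(len(weights[correct])):
        # d falls as the correct-class score rises, so the two updates
        # carry opposite signs.
        new_weights[correct][b] += lr * slope * loglik[correct][b]
        new_weights[competitor][b] -= lr * slope * loglik[competitor][b]
    return new_weights, loss


if __name__ == "__main__":
    # Toy values: 2 classes, 2 sub-bands, class 0 is the correct label.
    w = [[1.0, 1.0], [1.0, 1.0]]
    ll = [[-10.0, -4.0], [-6.0, -9.0]]
    w, loss_before = mce_step(w, ll, correct=0)
    _, loss_after = mce_step(w, ll, correct=0)
    print(loss_before, loss_after)
```

Because log-likelihoods are typically negative, this update shrinks the weights of the correct class and enlarges those of the competitor, which widens the gap between their combined scores.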

Bibliographic reference. McMahon, Philip / McCourt, Paul / Vaseghi, Saeed (1998): "Discriminative weighting of multi-resolution sub-band cepstral features for speech recognition", in ICSLP-1998, paper 0315.