Sixth International Conference on Spoken Language Processing (ICSLP 2000)

Beijing, China
October 16-20, 2000

Test of Several External Posterior Weighting Functions for Multiband Full Combination ASR

Hervé Glotin (2), Frédéric Berthommier (1)

(1) ICP la Communication Parlée, Grenoble, France
(2) IDIAP Inst. of Perceptual Artificial Intelligence, Martigny, Switzerland

Information about speech reliability can be extracted and then integrated in a recogniser by various means. The full combination (FC) approach allows the weighting of the posterior values estimated locally in the time-frequency representation, according to a speech reliability measure. Since most speech segments are voiced, we use a method that exploits the harmonicity of speech to derive these weights. We test this method together with the direct integration of the a priori SNR. We then run speech recognition with different kinds of weighting functions. The weights are continuous or binary values, corresponding to a soft or a hard decision function about speech reliability, which is derived from an observable harmonicity index. With a binary decision process, the effect is, for each time frame, to collapse the set of combinations of sub-bands into a single combination. Alternatively, we substitute empirical values for these terms, including functions of the a priori SNR, which are continuous or discrete but not based on a probabilistic estimation. We report the average scores in % WER for a panel of noises at different levels, stationary or not, narrow-band or wide-band. All these functions are found to be sub-optimal compared with constant weighting, but the FC approach is observed to be robust to narrow-band noise.
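
The difference between soft and hard weighting can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each sub-band has a reliability value in [0, 1], that the weight of a sub-band combination is the product of the reliabilities of the bands it includes and of one minus the reliabilities of the bands it excludes, and that the combined posterior is the weighted sum of the posteriors of all combinations. The function names, the two-band setup and the numeric values are hypothetical.

```python
from itertools import product


def combination_weights(reliability):
    """Weight of every sub-band combination (assumed product form).

    `reliability` holds one value in [0, 1] per sub-band. Each combination
    is a tuple of 0/1 flags, one per band (1 = band is used).
    """
    weights = {}
    for mask in product((0, 1), repeat=len(reliability)):
        w = 1.0
        for used, r in zip(mask, reliability):
            w *= r if used else (1.0 - r)
        weights[mask] = w
    return weights


def combine_posteriors(posteriors, weights):
    """Weighted sum of the class posteriors estimated for each combination."""
    n_classes = len(next(iter(posteriors.values())))
    combined = [0.0] * n_classes
    for mask, w in weights.items():
        for k, p in enumerate(posteriors[mask]):
            combined[k] += w * p
    return combined


# Hypothetical two-class posteriors for each combination of two sub-bands.
posteriors = {
    (0, 0): [0.5, 0.5],  # no band used: fall back to an uninformative value
    (0, 1): [0.4, 0.6],
    (1, 0): [0.8, 0.2],
    (1, 1): [0.7, 0.3],
}

soft = combination_weights([0.9, 0.2])  # continuous (soft) reliabilities
hard = combination_weights([1.0, 0.0])  # binary (hard) decisions

soft_posterior = combine_posteriors(posteriors, soft)
hard_posterior = combine_posteriors(posteriors, hard)
```

With binary reliabilities, every combination except one receives zero weight, so `hard_posterior` is simply the posterior of the single retained combination `(1, 0)`. With soft reliabilities, all four combinations contribute.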

Bibliographic reference.  Glotin, Hervé / Berthommier, Frédéric (2000): "Test of several external posterior weighting functions for multiband full combination ASR", In ICSLP-2000, vol.1, 333-336.