Sixth European Conference on Speech Communication and Technology
The addition of a word-normalized energy contour uniformly improves the performance of the HMM recognizer and makes it more robust to differences in talker populations. This kind of normalization generally requires statistics of the energy features over the whole utterance, which is not feasible in real-time applications because of the unnecessarily long processing delay. In this paper, we propose a more efficient implementation of energy feature normalization in which the normalization coefficients are computed using a look-ahead delay of fixed length. Experimental results on a German connected digit recognition task show that the look-ahead delay energy normalization scheme yields a 12% reduction in string error rate compared with not using the energy feature. A further 10% reduction in string error rate is achieved by integrating the speech/nonspeech decision mechanism.
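The fixed look-ahead idea can be illustrated with a minimal sketch. The abstract does not give the normalization formula, so this assumes a common choice: subtracting the running maximum of the log energy, where the maximum at frame t covers all frames up to t plus the look-ahead. The function name `lookahead_normalize` and its parameters are hypothetical, and the speech/nonspeech decision is not modeled.

```python
import numpy as np

def lookahead_normalize(log_energy, lookahead):
    """Normalize per-frame log energy using only a fixed look-ahead.

    Assumption (not from the paper): the normalization coefficient at
    frame t is the maximum log energy over frames 0 .. t + lookahead,
    and it is subtracted from the current frame's log energy.
    """
    e = np.asarray(log_energy, dtype=float)
    out = np.empty_like(e)
    running_max = -np.inf
    last = len(e) - 1
    for t in range(len(e)):
        # Newest frame available at time t, clipped at the utterance end.
        end = min(t + lookahead, last)
        # Update the running maximum with the frames in the current window.
        running_max = max(running_max, e[t:end + 1].max())
        out[t] = e[t] - running_max
    return out

# Toy log-energy contour (illustrative values, not data from the paper).
frames = [1.0, 3.0, 2.0, 5.0, 4.0]
short_delay = lookahead_normalize(frames, lookahead=1)
whole_utterance = lookahead_normalize(frames, lookahead=len(frames))
```

With a look-ahead that spans the whole utterance, the sketch reduces to ordinary utterance-level normalization; a shorter look-ahead trades some accuracy in the early frames for a bounded processing delay.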
Bibliographic reference. Chengalvarayan, Rathinavelu (1999): "Robust energy normalization using speech/nonspeech discriminator for German connected digit recognition", In EUROSPEECH'99, 61-64.