Sixth International Conference on Spoken Language Processing
The input feature vector used by a recognizer for recognition and modeling is generally extended to include, in addition to the cepstrum and the peak-normalized energy, dynamic information in the form of first- and second-order derivatives of the cepstral features and the energy. The problem with the energy normalization approach is that it is not suitable for real-time applications, since determining the peak energy introduces long delays. In this paper, we propose a more efficient approach to energy feature transformation in which the energy feature is mapped onto a scale of 0 to 1 using a sigmoid function, thereby avoiding the need for energy normalization. Experimental results on a Tamil connected-digit recognition task show that the proposed nonlinear energy transformation scheme yields a 20% reduction in string error rate compared with using the untransformed raw energy feature.
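The contrast between the two energy treatments can be illustrated with a minimal sketch. The abstract does not give the exact form or parameters of the sigmoid, so the logistic function below, the `slope` and `midpoint` parameters, and all function names are assumptions chosen for illustration, not the authors' implementation. The sketch only shows why a frame-by-frame sigmoid mapping needs no look-ahead, whereas peak normalization must see the whole utterance first.

```python
import math


def sigmoid_energy(log_energy, slope=1.0, midpoint=0.0):
    """Map one raw (log) energy value into the range 0 to 1.

    `slope` and `midpoint` are hypothetical tuning parameters; the
    abstract does not specify how the sigmoid is parameterized.
    """
    z = slope * (log_energy - midpoint)
    # Numerically stable logistic: never exponentiate a large positive value.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def peak_normalize(energies):
    """Baseline for comparison: subtract the utterance peak (log domain).

    The peak is known only after every frame has been seen, which is
    the source of the delay described in the abstract.
    """
    peak = max(energies)
    return [e - peak for e in energies]


def transform_stream(frames, slope=1.0, midpoint=0.0):
    """Transform energies one frame at a time, with no look-ahead."""
    for e in frames:
        yield sigmoid_energy(e, slope, midpoint)
```

In this sketch `transform_stream` can emit a value as soon as each frame arrives, while `peak_normalize` has to buffer the full list before it can return anything.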
Bibliographic reference. Chengalvarayan, Rathinavelu (2000): "The use of nonlinear energy transformation for Tamil connected-digit speech recognition", In ICSLP-2000, vol.3, 1121-1124.