In this paper, we explored a feature component masking scheme for embedded tonal language recognition systems, aiming to reduce computational complexity with minimal degradation of recognition accuracy. We conducted extensive experiments on a Mandarin isolated word recognition task with a tone-confusable vocabulary. Considering both clean and noisy conditions, we found a masking scheme that filtered out 31 of the 54 feature components and still outperformed the 54-component baseline, with dramatically lower computational and memory complexity. The results showed that feature masking is a promising approach to complexity reduction in embedded tonal language recognition systems. They also confirmed the effectiveness of higher-order cepstral coefficients for tonal language recognition, since most of these coefficients were preserved in the feature masking experiments.
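As a rough illustration of what masking feature components means in practice, the sketch below applies a boolean mask to a single 54-component feature frame and keeps 23 components. Only the counts 54 and 31 come from the abstract. The function name `apply_mask`, the dummy frame, and above all the choice of which components are dropped are assumptions made for illustration; the abstract does not say which components were masked, and this placeholder mask does not reflect the paper's finding on higher-order cepstral coefficients.

```python
from typing import List, Sequence

NUM_COMPONENTS = 54  # size of the baseline feature set (from the abstract)
NUM_MASKED = 31      # number of components filtered out (from the abstract)


def apply_mask(frame: Sequence[float], keep: Sequence[bool]) -> List[float]:
    """Return only the components of one feature frame whose mask entry is True."""
    if len(frame) != len(keep):
        raise ValueError("frame and mask must have the same length")
    return [value for value, kept in zip(frame, keep) if kept]


# Hypothetical mask: WHICH components to drop is not given in the abstract,
# so this placeholder simply drops the last 31 positions.
keep_mask = [i < NUM_COMPONENTS - NUM_MASKED for i in range(NUM_COMPONENTS)]

# Dummy feature frame standing in for real acoustic features.
frame = [float(i) for i in range(NUM_COMPONENTS)]

reduced = apply_mask(frame, keep_mask)
print(len(reduced))  # number of components left after masking
```

Because the mask is fixed ahead of time, a recognizer could in principle skip computing and storing the masked components altogether, which is where the savings in computation and memory would come from.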
Cite as: Tang, Y., Wang, X., Cao, Y., Ding, F. (2004) Feature Masking in an Embedded Mandarin Speech Recognition System. Proc. International Symposium on Chinese Spoken Language Processing, 245-248
@inproceedings{tang04c_iscslp,
  author={Yuezhong Tang and Xia Wang and Yang Cao and Feng Ding},
  title={{Feature Masking in an Embedded Mandarin Speech Recognition System}},
  year=2004,
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},
  pages={245--248}
}