This paper investigates how two types of imperfection in the front-end articulatory feature detector, namely detection errors and articulatory feature asynchrony, affect the performance of a detection-based ASR system. Based on a set of controlled-variable experiments, we find that articulatory feature asynchrony is the major issue that should be addressed in detection-based ASR. To this end, we propose several methods to reduce the asynchrony or its effects. The results are promising: for example, we currently achieve 67.67% phone accuracy on the TIMIT free phone recognition task with only 11 binary-valued articulatory features.
Cite as: Chen, I.-F., Wang, H.-M. (2009) Articulatory feature asynchrony analysis and compensation in detection-based ASR. Proc. Interspeech 2009, 3059-3062, doi: 10.21437/Interspeech.2009-568
@inproceedings{chen09c_interspeech,
  author={I-Fan Chen and Hsin-Min Wang},
  title={{Articulatory feature asynchrony analysis and compensation in detection-based ASR}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={3059--3062},
  doi={10.21437/Interspeech.2009-568}
}