Robust speech recognition under varying acoustic conditions may be achieved by exploiting multiple sources of information in the speech signal. In addition to an acoustic signal representation, we use an articulatory representation consisting of pseudo-articulatory features as an additional information source. Hybrid ANN/HMM recognizers using either of these representations are evaluated on a continuous numbers recognition task (OGI Numbers95) under clean, reverberant, and noisy conditions. An error analysis of preliminary recognition results shows that the two representations produce qualitatively different errors, which suggests combining them. We investigate various combination possibilities at the phoneme estimation level and show that significant improvements can be achieved under all three acoustic conditions.
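Combination at the phoneme estimation level means merging the per-frame phoneme posterior vectors produced by the acoustic and articulatory ANNs before HMM decoding. As a minimal sketch of this idea (the function name, interface, and the specific product/sum rules are illustrative assumptions, not the paper's exact method):

```python
import numpy as np

def combine_posteriors(p_acoustic, p_artic, rule="product"):
    """Merge per-frame phoneme posterior vectors from two classifiers.

    rule="product" multiplies the streams (treating them as independent
    evidence); rule="sum" averages them. Both renormalize so the result
    is again a probability distribution over phonemes.
    Illustrative sketch only; not the paper's actual combination scheme.
    """
    p_acoustic = np.asarray(p_acoustic, dtype=float)
    p_artic = np.asarray(p_artic, dtype=float)
    if rule == "product":
        combined = p_acoustic * p_artic
    elif rule == "sum":
        combined = 0.5 * (p_acoustic + p_artic)
    else:
        raise ValueError(f"unknown combination rule: {rule}")
    return combined / combined.sum()
```

The product rule tends to sharpen the distribution when both streams agree, while the sum rule is more forgiving when one stream is corrupted, e.g. by additive noise that degrades the acoustic stream more than the articulatory one.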
Cite as: Kirchhoff, K. (1998) Combining articulatory and acoustic information for speech recognition in noisy and reverberant environments. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0873, doi: 10.21437/ICSLP.1998-313
@inproceedings{kirchhoff98_icslp,
  author    = {Katrin Kirchhoff},
  title     = {{Combining articulatory and acoustic information for speech recognition in noisy and reverberant environments}},
  booktitle = {Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  year      = {1998},
  pages     = {paper 0873},
  doi       = {10.21437/ICSLP.1998-313}
}