EUROSPEECH 2003 - INTERSPEECH 2003
Noise robustness is one of the most challenging problems in speech recognition research. In this work, we propose a noise-robust and computationally simple system for small-vocabulary speech recognition. We approach the noise-robust digit recognition problem with the missing frames idea. The key point behind this idea is that frames with energies below a certain threshold are considered unreliable. We set these frames to a silence floor and treat them as silence frames. Performing this operation only in the decoding stage creates a large mismatch between the speech used for training and the speech being decoded. To solve the mismatch problem, we apply the same thresholding algorithm to the training data before training. The algorithm adds negligible computational complexity at the front end and decreases the overall computational complexity. Moreover, it outperforms other computationally comparable, well-known methods. This makes the proposed system particularly suitable for real-time systems.
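The thresholding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the energy measure, the threshold value, or the form of the silence floor, so the log-energy definition, the `threshold_db` parameter, the `silence_vector` argument, and both function names are assumptions made for illustration only.

```python
import math


def frame_log_energies(samples, frame_len, hop):
    """Return the log energy (in dB) of each full frame of `samples`.

    Assumption: energy is the sum of squared samples; a small constant
    avoids log(0) on all-zero frames.
    """
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame)
        energies.append(10.0 * math.log10(energy + 1e-10))
    return energies


def floor_low_energy_frames(features, energies, threshold_db, silence_vector):
    """Replace the feature vector of every low-energy frame with a silence floor.

    Frames whose log energy is below `threshold_db` are treated as
    unreliable and overwritten with `silence_vector` (a hypothetical
    stand-in for the silence floor). Returns the new feature list and a
    reliability mask (True = frame kept as is).
    """
    floored = []
    mask = []
    for feat, energy in zip(features, energies):
        reliable = energy >= threshold_db
        mask.append(reliable)
        floored.append(list(feat) if reliable else list(silence_vector))
    return floored, mask


# Toy example: one silent frame followed by one loud frame.
samples = [0.0] * 4 + [1.0] * 4
energies = frame_log_energies(samples, frame_len=4, hop=4)
features = [[1.0, 2.0], [3.0, 4.0]]  # placeholder feature vectors
floored, mask = floor_low_energy_frames(
    features, energies, threshold_db=0.0, silence_vector=[0.0, 0.0]
)
```

To avoid the train/decode mismatch noted in the abstract, the same function with the same threshold and silence floor would be applied to the training features before training and to the test features before decoding.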
Bibliographic reference. Demiroglu, Cenk / Anderson, David V. (2003): "Noise robust digit recognition with missing frames", In EUROSPEECH-2003, 2165-2168.