INTERSPEECH 2004 - ICSLP
Behavioral synchronization between speech and finger tapping provides a novel approach to improving speech recognition accuracy. We incorporate a sequence of finger-tapping timings, recorded alongside an utterance, into recognition using two distinct methods: in the first, HMM state transition probabilities at word boundaries are controlled by the timing of the finger tapping; in the second, the probability (relative frequency) of finger tapping is used as a 'feature' and combined with MFCC in an HMM recognition system. We evaluate these methods on connected digit recognition under different noise conditions (AURORA-2J) and on LVCSR tasks. Leveraging the synchrony between speech and finger tapping provides a 46% relative improvement in the connected digit recognition experiments and a 1% absolute improvement in the LVCSR experiments.
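As an illustration of the second method, the following minimal sketch appends a per-frame tapping score to each MFCC vector. It is not the authors' implementation: the abstract does not say how the tapping probability is estimated, so the Gaussian bump around each tap, the 10 ms frame shift, the 50 ms width, and the function names `tap_feature` and `append_tap_feature` are all assumptions made for this example.

```python
import math


def tap_feature(tap_times, num_frames, frame_shift=0.010, width=0.050):
    """Return one tapping score in [0, 1] per frame.

    Assumption: the score is a Gaussian bump centred on the nearest tap.
    The paper's actual relative-frequency estimate is not given in the abstract.
    """
    scores = []
    for t in range(num_frames):
        frame_time = t * frame_shift  # time of frame t in seconds
        score = max(
            (math.exp(-0.5 * ((frame_time - tap) / width) ** 2) for tap in tap_times),
            default=0.0,  # no taps recorded: the feature is zero everywhere
        )
        scores.append(score)
    return scores


def append_tap_feature(mfcc_frames, tap_times, frame_shift=0.010, width=0.050):
    """Concatenate the tapping score onto each MFCC frame as an extra dimension."""
    scores = tap_feature(tap_times, len(mfcc_frames), frame_shift, width)
    return [list(frame) + [score] for frame, score in zip(mfcc_frames, scores)]


# Hypothetical input: 100 frames of 13-dimensional MFCCs and two taps (in seconds).
mfcc = [[0.0] * 13 for _ in range(100)]
combined = append_tap_feature(mfcc, tap_times=[0.25, 0.60])
```

Each combined frame has one more dimension than the MFCC input, so an HMM recognizer trained on these vectors would see the tapping information alongside the acoustic features.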
Bibliographic reference. Ban, Hiromitsu / Miyajima, Chiyomi / Itou, Katsunobu / Itakura, Fumitada / Takeda, Kazuya (2004): "Speech recognition using synchronization between speech and finger tapping", In INTERSPEECH-2004, 2289-2292.