We compare a variety of strategies for incorporating spelling to create more robust voice-only speech interfaces. These strategies use different combinations of speaking the word, spelling the word, and spelling the word using a phonetic alphabet. For correcting a single recognition error, spelling the word or speaking and spelling the word reduced error rates substantially. Phonetic spelling was very accurate, with error rates on a 5K task approaching zero. Most importantly, multiple input strategies could be used simultaneously with only a modest degradation in performance compared to allowing only a single input strategy. Thus, our work shows that spelling-based input strategies offer the potential of a simple, natural, and effective way for users to both avoid and correct recognition errors.
Index Terms: speech recognition, error correction
Bibliographic reference. Vertanen, Keith / Kristensson, Per Ola (2012): "Spelling as a complementary strategy for speech recognition", In INTERSPEECH-2012, 2294-2297.