ISCA Archive Interspeech 2009

Robust in-car spelling recognition - a tandem BLSTM-HMM approach

Martin Wöllmer, Florian Eyben, Björn Schuller, Yang Sun, Tobias Moosmayr, Nhu Nguyen-Thien

As an intuitive hands-free input modality, automatic spelling recognition is especially useful for in-car human-machine interfaces. However, for today’s speech recognition engines it is extremely challenging to cope with similar-sounding spelling speech sequences in the presence of noise, such as the driving noise inside a car. Thus, we propose a novel Tandem spelling recogniser, combining a Hidden Markov Model (HMM) with a discriminatively trained bidirectional Long Short-Term Memory (BLSTM) recurrent neural net. The BLSTM network captures long-range temporal dependencies to learn the properties of in-car noise, which makes the Tandem BLSTM-HMM robust with respect to speech signal disturbances at extremely low signal-to-noise ratios and to mismatches between training and test noise conditions. Experiments considering various driving conditions reveal that our Tandem recogniser outperforms a conventional HMM by up to 33%.

doi: 10.21437/Interspeech.2009-375

Cite as: Wöllmer, M., Eyben, F., Schuller, B., Sun, Y., Moosmayr, T., Nguyen-Thien, N. (2009) Robust in-car spelling recognition - a tandem BLSTM-HMM approach. Proc. Interspeech 2009, 2507-2510, doi: 10.21437/Interspeech.2009-375

  author={Martin Wöllmer and Florian Eyben and Björn Schuller and Yang Sun and Tobias Moosmayr and Nhu Nguyen-Thien},
  title={{Robust in-car spelling recognition - a tandem BLSTM-HMM approach}},
  booktitle={Proc. Interspeech 2009},