CRIM’s Speech Transcription and Call Sign Detection System for the ATC Airbus Challenge Task

Vishwa Gupta, Lise Rebout, Gilles Boulianne, Pierre-André Ménard, Jahangir Alam


The Airbus air traffic control (ATC) challenge evaluates speech recognition and call sign detection using real conversations between air traffic controllers and pilots at Toulouse airport in France. CRIM’s main contribution to acoustic modeling for transcribing these conversations is experimentation with bidirectional LSTM (BLSTM) models and TDNN models trained with lattice-free MMI (LF-MMI). Adapting acoustic models trained on a large dataset to the 40 hours of ATC acoustic training data reduces WER significantly compared to training them on the ATC data alone. Multiple iterations of adaptation reduce WER significantly for the BLSTM acoustic models, but only marginally for the LF-MMI TDNN acoustic models. The constrained dialog between air traffic controller and pilot leads to a language model perplexity below 12, and to WERs of 9.98% and 9.41% on the leaderboard and evaluation sets respectively.
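The abstract reports WER without defining it. As a reminder of the standard definition (substitutions, deletions and insertions divided by the number of reference words), a minimal Python sketch follows. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the text normalization and scoring tool used in the challenge are not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For a six-word reference with one misrecognized word, this gives 1/6, or about 16.7%.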

For call sign detection from the decoded transcript, we use a bidirectional LSTM followed by a conditional random field (CRF) classifier. This DNN architecture worked better than call sign detection based on a finite state transducer. Taking a majority vote over the call signs from multiple decodes reduced call sign errors. The best F1 for call sign detection was 0.8289 on the leaderboard set and 0.8017 on the evaluation set. Overall, we placed 3rd in this evaluation.
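The majority vote over call signs from multiple decodes can be sketched as below. This is only an illustration under stated assumptions, not the authors' implementation: the function name `majority_vote_call_sign`, the use of `None` for "no call sign detected", and the tie-breaking rule (the earliest decode in the list wins) are all hypothetical choices, and the call signs in the usage note are made up.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote_call_sign(hypotheses: Sequence[Optional[str]]) -> Optional[str]:
    """Return the call sign proposed by the most decodes.

    `hypotheses` holds one detected call sign per decode of the same
    utterance, with None where a decode yielded no call sign.
    """
    detected = [h for h in hypotheses if h]
    if not detected:
        return None  # no decode produced a call sign
    counts = Counter(detected)
    # Counter.most_common orders equal counts by first occurrence, so on a
    # tie the call sign from the earliest decode in the list is returned.
    return counts.most_common(1)[0][0]
```

For example, `majority_vote_call_sign(["AFR123", "AFR128", "AFR123"])` returns `"AFR123"`. Listing the most trusted decode first makes the tie-breaking rule favor it.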


DOI: 10.21437/Interspeech.2019-1131

Cite as: Gupta, V., Rebout, L., Boulianne, G., Ménard, P., Alam, J. (2019) CRIM’s Speech Transcription and Call Sign Detection System for the ATC Airbus Challenge Task. Proc. Interspeech 2019, 3018-3022, DOI: 10.21437/Interspeech.2019-1131.


@inproceedings{Gupta2019,
  author={Vishwa Gupta and Lise Rebout and Gilles Boulianne and Pierre-André Ménard and Jahangir Alam},
  title={{CRIM’s Speech Transcription and Call Sign Detection System for the ATC Airbus Challenge Task}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={3018--3022},
  doi={10.21437/Interspeech.2019-1131},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1131}
}