In this study, we developed an off-topic response detection system for use in the automated scoring of non-native English speakers’ spontaneous speech. Using transcriptions generated by an automatic speech recognition (ASR) system trained on non-native speakers’ speech, together with various semantic similarity features, the system classified each test response as on-topic or off-topic. The recent success of deep neural networks (DNN) in text similarity detection led us to explore DNN-based document similarity features. Specifically, we used a siamese adaptation of the convolutional neural network (CNN), due to its efficiency in learning similarity patterns simultaneously from both the responses and the questions used to elicit them. In addition, a baseline system was developed using a standard vector space model (VSM) trained on sample responses for each question. The siamese CNN-based system achieved an accuracy of 0.97, a 50% relative error reduction compared with the standard VSM-based system. Furthermore, the accuracy of the siamese CNN-based system was consistent across different questions.
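To illustrate the kind of VSM baseline the abstract mentions, the following is a minimal sketch rather than the authors' implementation. The abstract does not specify the term weighting, the similarity measure, or the decision rule, so the TF-IDF weighting, cosine similarity, whitespace tokenizer, function names, and the `threshold=0.2` cutoff are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def build_idf(documents):
    """Compute smoothed inverse document frequency over tokenized documents."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }


def tfidf_vector(tokens, idf):
    """Build a sparse TF-IDF vector; terms unseen in the samples are dropped."""
    counts = Counter(tokens)
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def is_off_topic(response, sample_responses, idf, threshold=0.2):
    """Flag a response whose best similarity to the question's sample
    responses falls below a hypothetical threshold (assumed value)."""
    vec = tfidf_vector(tokenize(response), idf)
    best = max(
        cosine(vec, tfidf_vector(tokenize(sample), idf))
        for sample in sample_responses
    )
    return best < threshold
```

In this sketch, `build_idf` would be fit on the tokenized sample responses for one question, and `is_off_topic` would then be called on each ASR transcription for that question. In practice the threshold would need to be tuned on held-out data.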
Cite as: Lee, C.M., Yoon, S.-Y., Wang, X., Mulholland, M., Choi, I., Evanini, K. (2017) Off-Topic Spoken Response Detection Using Siamese Convolutional Neural Networks. Proc. Interspeech 2017, 1427-1431, doi: 10.21437/Interspeech.2017-1174
@inproceedings{lee17b_interspeech,
  author={Chong Min Lee and Su-Youn Yoon and Xihao Wang and Matthew Mulholland and Ikkyu Choi and Keelan Evanini},
  title={{Off-Topic Spoken Response Detection Using Siamese Convolutional Neural Networks}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1427--1431},
  doi={10.21437/Interspeech.2017-1174}
}