Assessing the Semantic Space Bias Caused by ASR Error Propagation and its Effect on Spoken Document Summarization

Máté Ákos Tündik, Valér Kaszás, György Szaszák


Ambitions in artificial intelligence involve machine understanding of human language. The state-of-the-art approach to Spoken Language Understanding uses an Automatic Speech Recognizer (ASR) to generate transcripts, which are then processed with text-based tools. ASR yields error-prone transcripts, and these errors propagate further into the processing pipeline. Subjective tests show, on the other hand, that humans understand ASR closed captions quite well despite the word and punctuation errors. Our goal is to assess and quantify the loss in the semantic space resulting from error propagation, and also to analyze error propagation into speech summarization as a special use case. We show that word errors cause a slight shift in the semantic space, which is well below the average semantic distance between the sentences within a document. We also show that punctuation errors have a higher impact on summarization performance, which suggests that proper sentence-level tokenization is crucial for this task.
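The comparison described in the abstract, a shift caused by word errors set against the average distance between sentences of the same document, can be illustrated with a minimal sketch. The abstract does not state which embedding or distance measure the authors use, so cosine distance, the toy three-dimensional vectors, and the function names `cosine_distance`, `mean_shift`, and `mean_pairwise_distance` are all assumptions made for illustration only.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_shift(ref_vecs, asr_vecs):
    """Average distance between each reference sentence vector and the
    vector of its ASR counterpart (assumed to be aligned one-to-one)."""
    distances = [cosine_distance(r, a) for r, a in zip(ref_vecs, asr_vecs)]
    return sum(distances) / len(distances)


def mean_pairwise_distance(vecs):
    """Average distance over all pairs of distinct sentences in a document."""
    pairs = list(combinations(vecs, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Toy sentence vectors (assumed, not taken from the paper).
ref = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
asr = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]

shift = mean_shift(ref, asr)            # bias introduced by recognition errors
spread = mean_pairwise_distance(ref)    # typical distance between sentences

print(f"mean shift: {shift:.4f}, mean inter-sentence distance: {spread:.4f}")
print("shift below inter-sentence distance:", shift < spread)
```

In a real experiment the vectors would come from a sentence embedding model applied to the reference transcripts and to the ASR output; the sketch only shows how the two quantities relate.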


DOI: 10.21437/Interspeech.2019-2154

Cite as: Tündik, M.Á., Kaszás, V., Szaszák, G. (2019) Assessing the Semantic Space Bias Caused by ASR Error Propagation and its Effect on Spoken Document Summarization. Proc. Interspeech 2019, 1333-1337, DOI: 10.21437/Interspeech.2019-2154.


@inproceedings{Tündik2019,
  author={Máté Ákos Tündik and Valér Kaszás and György Szaszák},
  title={{Assessing the Semantic Space Bias Caused by ASR Error Propagation and its Effect on Spoken Document Summarization}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1333--1337},
  doi={10.21437/Interspeech.2019-2154},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2154}
}