Symbol Sequence Search from Telephone Conversation

Masayuki Suzuki, Gakuto Kurata, Abhinav Sethy, Bhuvana Ramabhadran, Kenneth W. Church, Mark Drake


We propose a method for searching for symbol sequences in conversations. Symbol sequences include phone numbers, credit card numbers, and any kind of ticket (identification) numbers, and are often communicated in call center conversations. Automatic extraction of these from speech is key to many automatic speech recognition (ASR) applications such as question answering and summarization. Compared with spoken term detection (STD), symbol sequence search poses two additional problems. First, the entire symbol sequence is typically not observed continuously but in subsequences: customers or agents speak these sequences in fragments, while the recipient repeats them to ensure they have the correct sequence. Second, we have to distinguish between different symbol sequences, for example, phone numbers versus ticket numbers or customer identification numbers. To deal with these problems, we propose to apply STD to symbol-sequence fragments and subsequently use confidence scoring to obtain the entire symbol sequence. For the confidence scoring, we propose a long short-term memory (LSTM) based approach that takes as input the words before and after each fragment. We also propose to detect repetitions of fragments and use them for confidence scoring. Our proposed method achieves a 0.87 F-measure on an eight-digit customer identification number search task when operating at a 20.3% word error rate (WER).
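The fragment-plus-repetition idea from the abstract can be illustrated with a toy sketch. This is not the paper's method (the paper uses STD over lattices and an LSTM confidence scorer); here, fragment extraction from a word-level transcript, a simple repetition-based confidence heuristic, and a naive assembly rule are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical mapping from spoken digit words to symbols.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

def extract_fragments(tokens):
    """Group maximal runs of consecutive digit words into fragments."""
    fragments, current = [], []
    for token in tokens:
        if token in DIGIT_WORDS:
            current.append(DIGIT_WORDS[token])
        elif current:
            fragments.append("".join(current))
            current = []
    if current:
        fragments.append("".join(current))
    return fragments

def score_fragments(fragments):
    """Toy confidence heuristic: a fragment repeated (e.g. read back by the
    recipient) is more likely part of the target sequence."""
    counts = Counter(fragments)
    return {f: min(1.0, 0.5 + 0.25 * (counts[f] - 1)) for f in counts}

def assemble(fragments, target_len=8):
    """Naive assembly: concatenate distinct fragments in order of first
    occurrence; accept only if the result has the expected length."""
    seen, result = set(), ""
    for fragment in fragments:
        if fragment not in seen:
            seen.add(fragment)
            result += fragment
    return result if len(result) == target_len else None

# Toy conversation: the caller says the first four digits, the agent
# repeats them, then the caller gives the remaining four.
transcript = ("okay one two three four yes one two three four "
              "then five six seven eight").split()
fragments = extract_fragments(transcript)   # ["1234", "1234", "5678"]
scores = score_fragments(fragments)          # repeated "1234" scores higher
sequence = assemble(fragments)               # "12345678"
```

The repetition boost mirrors the abstract's observation that read-backs are a usable confidence signal, though the real system scores fragments with an LSTM over surrounding words rather than a fixed heuristic.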


DOI: 10.21437/Interspeech.2017-904

Cite as: Suzuki, M., Kurata, G., Sethy, A., Ramabhadran, B., Church, K.W., Drake, M. (2017) Symbol Sequence Search from Telephone Conversation. Proc. Interspeech 2017, 3612-3616, DOI: 10.21437/Interspeech.2017-904.


@inproceedings{Suzuki2017,
  author={Masayuki Suzuki and Gakuto Kurata and Abhinav Sethy and Bhuvana Ramabhadran and Kenneth W. Church and Mark Drake},
  title={Symbol Sequence Search from Telephone Conversation},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={3612--3616},
  doi={10.21437/Interspeech.2017-904},
  url={http://dx.doi.org/10.21437/Interspeech.2017-904}
}