Using Prosodic and Lexical Information for Learning Utterance-level Behaviors in Psychotherapy

Karan Singla, Zhuohao Chen, Nikolaos Flemotomos, James Gibson, Dogan Can, David Atkins, Shrikanth Narayanan


In this paper, we present an approach for predicting utterance-level behaviors in psychotherapy sessions using both speech and lexical features. We train long short-term memory (LSTM) networks with an attention mechanism using words, both manually and automatically transcribed, and prosodic features, at the word level, to predict the annotated behaviors. We demonstrate that prosodic features provide discriminative information relevant to the behavior prediction task and show that they improve prediction when fused with automatically derived lexical features. Additionally, we investigate the weights of the attention mechanism to determine which words and prosodic patterns are important to the behavior prediction task.
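
The word-level fusion and attention-weight inspection described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes simple concatenation as the fusion step, uses the fused vectors directly in place of LSTM hidden states, and scores each word by a dot product with a fixed query vector. The function names (`fuse`, `attention_pool`, `rank_words`), the feature values, and the query are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(lexical, prosodic):
    """Concatenate the lexical and prosodic vector of each word.

    Assumption: fusion by concatenation; the paper may fuse differently.
    """
    if len(lexical) != len(prosodic):
        raise ValueError("need one prosodic vector per word")
    return [lex + pro for lex, pro in zip(lexical, prosodic)]


def attention_pool(states, query):
    """Dot-product attention over per-word states.

    Returns the attention weights (one per word) and the weighted sum
    of the states, which would feed an utterance-level classifier.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    pooled = [sum(w * s[d] for w, s in zip(weights, states)) for d in range(dim)]
    return weights, pooled


def rank_words(words, weights):
    """Sort words by attention weight, highest first."""
    return sorted(zip(words, weights), key=lambda pair: pair[1], reverse=True)


# Toy utterance with made-up features (hypothetical values).
words = ["i", "really", "want", "to", "quit"]
lexical = [[0.1, 0.0], [0.4, 0.2], [0.3, 0.5], [0.0, 0.1], [0.9, 0.7]]
prosodic = [[0.2], [0.8], [0.3], [0.1], [0.6]]  # e.g. one pitch statistic per word
query = [1.0, 1.0, 1.0]  # stands in for a learned attention vector

states = fuse(lexical, prosodic)  # in a real model, LSTM outputs would go here
weights, pooled = attention_pool(states, query)
ranking = rank_words(words, weights)
```

Because the weights form a distribution over the words of one utterance, sorting them gives a rough view of which words, together with their prosody, the model attended to most. In a trained model the query and the states would be learned rather than fixed by hand.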


DOI: 10.21437/Interspeech.2018-2551

Cite as: Singla, K., Chen, Z., Flemotomos, N., Gibson, J., Can, D., Atkins, D., Narayanan, S. (2018) Using Prosodic and Lexical Information for Learning Utterance-level Behaviors in Psychotherapy. Proc. Interspeech 2018, 3413-3417, DOI: 10.21437/Interspeech.2018-2551.


@inproceedings{Singla2018,
  author={Karan Singla and Zhuohao Chen and Nikolaos Flemotomos and James Gibson and Dogan Can and David Atkins and Shrikanth Narayanan},
  title={Using Prosodic and Lexical Information for Learning Utterance-level Behaviors in Psychotherapy},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3413--3417},
  doi={10.21437/Interspeech.2018-2551},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2551}
}