We address the problem of detecting the emotional states of obsessive-compulsive disorder (OCD) patients in an exposure-response prevention (ERP) therapy protocol scenario. Here, the emotional levels of a patient must be identified at the granular level needed for successful progression of the therapy, and one of the major hurdles in this is so-called alexithymia (a subclinical inability to identify emotions in the self). Alternatively, we propose estimating the emotional state of an OCD patient automatically from the raw speech signal, elicited under a situation-based emotion entry to an on-line therapy aid. Towards this, we propose a novel multi-temporal CNN architecture for end-to-end ‘speech emotion recognition’ (SER) from the raw speech signal. The proposed architecture employs multiple filter banks with different time-frequency resolutions to create feature maps ranging from very narrow-band to very wide-band spectrographic maps in fine steps of time-frequency resolution. On the SER task, we show a 2-8% absolute improvement in accuracy for the multi-temporal cases (e.g. 3, 6 branches) over conventional single-temporal CNNs. As a position paper, we identify further work as fine-granular detection of the OCD emotional states via valence-arousal-dominance estimation to derive the ‘degree’ of emotion of an OCD patient.
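To make the multi-temporal idea concrete, the sketch below (in PyTorch) shows a minimal multi-branch front-end over the raw waveform, where each branch is a 1-D convolutional filter bank with a different kernel length (i.e. a different time-frequency resolution) and the branch feature maps are pooled and concatenated before a classifier head. The kernel lengths, strides, channel counts and the number of emotion classes are illustrative assumptions, not the configuration reported in the paper.

# Hedged sketch of a multi-temporal CNN for raw-speech SER; all hyperparameters are assumptions.
import torch
import torch.nn as nn

class MultiTemporalSER(nn.Module):
    def __init__(self, kernel_sizes=(25, 125, 500), n_filters=64, n_classes=4):
        super().__init__()
        # One 1-D conv "filter bank" per branch: longer kernels give finer
        # frequency / coarser time resolution (narrow-band maps), shorter
        # kernels give coarser frequency / finer time resolution (wide-band maps).
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(1, n_filters, kernel_size=k, stride=k // 5, padding=k // 2),
                nn.BatchNorm1d(n_filters),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),   # collapse the time axis per branch
            )
            for k in kernel_sizes
        ])
        self.classifier = nn.Linear(n_filters * len(kernel_sizes), n_classes)

    def forward(self, x):                  # x: (batch, 1, n_samples) raw waveform
        feats = [branch(x).squeeze(-1) for branch in self.branches]
        return self.classifier(torch.cat(feats, dim=1))

# Example: a batch of 1-second, 16 kHz utterances -> emotion-class logits
logits = MultiTemporalSER()(torch.randn(8, 1, 16000))

Adding more branches (e.g. 6 instead of 3) amounts to extending kernel_sizes with additional kernel lengths, giving finer steps between the narrow-band and wide-band extremes.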
Cite as: Gupta, K., Zulfiqar, A., Ramu, P., Purohit, T., Ramasubramanian, V. (2019) Detection of emotional states of OCD patients in an exposure-response prevention therapy scenario. Proc. Workshop on Speech, Music and Mind (SMM 2019), 21-25, doi: 10.21437/SMM.2019-5
@inproceedings{gupta19_smm,
  author={Kaajal Gupta and Anzar Zulfiqar and Pushpa Ramu and Tilak Purohit and V. Ramasubramanian},
  title={{Detection of emotional states of OCD patients in an exposure-response prevention therapy scenario}},
  year=2019,
  booktitle={Proc. Workshop on Speech, Music and Mind (SMM 2019)},
  pages={21--25},
  doi={10.21437/SMM.2019-5}
}