This paper describes a novel classification method for multi-stream conversational documents. Documents of contact center dialogues or meetings are often composed of multiple source documents, each a transcription of the recording of one speaker’s channel. To enhance classification performance on such multi-stream conversational documents, two main advances over the previous method are introduced. The first is a parallel hierarchical attention network (PHAN) for multi-stream conversational document modeling. PHAN can precisely capture the word and sentence structures of the individual source documents and efficiently integrate them. The second is a shared memory reader that provides a shared attention mechanism. The shared memory reader highlights important information that is common across a conversation. Our experiments on call category classification in contact center dialogues show that PHAN together with the shared memory reader outperforms both the single-document modeling method and the previous multi-stream document modeling method.
Cite as: Sawada, N., Masumura, R., Nishizaki, H. (2017) Parallel Hierarchical Attention Networks with Shared Memory Reader for Multi-Stream Conversational Document Classification. Proc. Interspeech 2017, 3311-3315, doi: 10.21437/Interspeech.2017-269
@inproceedings{sawada17_interspeech,
  author={Naoki Sawada and Ryo Masumura and Hiromitsu Nishizaki},
  title={{Parallel Hierarchical Attention Networks with Shared Memory Reader for Multi-Stream Conversational Document Classification}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={3311--3315},
  doi={10.21437/Interspeech.2017-269}
}