An important component of customer call experience analysis is distinguishing the different segments of a call: interactive voice response (IVR), waiting in queue, and interaction with an agent. Because segment information from telephone switches is not always available, or may be difficult to obtain, we sought a method that could perform this segmentation solely from the recorded audio. In this paper, we present a probabilistic framework for segmenting call center audio into IVR, Queue, and Agent segments using a suite of rich features based on both speech and non-speech content. We study two statistical classifiers, Maximum Entropy (MaxEnt) and Conditional Random Field (CRF). We present experimental results on real-world call center data and demonstrate that the probabilistic approach outperforms a rule-based approach in segmentation performance while significantly reducing the time needed to deploy the segmenter for a new call center.
Bibliographic reference. Zinovieva, Nina / Zhuang, Xiaodan / Peterson, Pat / Alwan, Joe / Prasad, Rohit (2013): "Probabilistic trainable segmenter for call center audio using multiple features", In INTERSPEECH-2013, 2054-2058.