14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Probabilistic Trainable Segmenter for Call Center Audio Using Multiple Features

Nina Zinovieva, Xiaodan Zhuang, Pat Peterson, Joe Alwan, Rohit Prasad

Raytheon BBN Technologies, USA

An important component of customer call experience analysis is distinguishing the different segments of a call, including interactive voice response (IVR), waiting in queue, and interaction with an agent. Because segment information from telephone switches is not always available or may be difficult to obtain, we sought a method that performs this segmentation solely from the recorded audio. In this paper, we present a probabilistic framework for segmenting call center audio into IVR, Queue, and Agent segments using a suite of rich features based on both speech and non-speech content. We study different statistical classifiers, such as Maximum Entropy (MaxEnt) and Conditional Random Field (CRF). We present experimental results on real-world call center data and demonstrate that the probabilistic approach outperforms a rule-based approach in segmentation performance while significantly reducing the time needed to deploy the segmenter for a new call center.
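To make the idea of probabilistic segmentation concrete, the following is a minimal sketch of a MaxEnt-style (softmax over linear scores) classifier applied to fixed-length audio windows, with consecutive windows of the same label merged into segments. It is an illustration only, not the authors' system: the feature names (`synth`, `music`, `turns`), the hand-set weights, the one-second window length, and the function names are all assumptions introduced here, and a real segmenter would learn its weights from labeled calls.

```python
import math

LABELS = ["IVR", "Queue", "Agent"]

# Hypothetical per-window features (assumed for illustration, not from the paper):
#   synth - fraction of the window that sounds like a recorded prompt
#   music - fraction of the window containing hold music
#   turns - speaker changes in the window, scaled to [0, 1]
# Hand-set illustrative weights; a trained model would estimate these from data.
WEIGHTS = {
    "IVR":   {"synth": 4.0, "music": -1.0, "turns": -2.0},
    "Queue": {"synth": -1.0, "music": 4.0, "turns": -2.0},
    "Agent": {"synth": -2.0, "music": -2.0, "turns": 4.0},
}


def maxent_posteriors(features):
    """Return P(label | features) as a softmax over linear scores."""
    scores = {
        label: sum(w * features.get(name, 0.0) for name, w in WEIGHTS[label].items())
        for label in LABELS
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def segment(windows, window_sec=1.0):
    """Label each window with its most probable class and merge equal neighbors."""
    segments = []
    for i, feats in enumerate(windows):
        post = maxent_posteriors(feats)
        label = max(post, key=post.get)
        start, end = i * window_sec, (i + 1) * window_sec
        if segments and segments[-1][2] == label:
            # Extend the previous segment instead of starting a new one.
            segments[-1] = (segments[-1][0], end, label)
        else:
            segments.append((start, end, label))
    return segments


# Toy call: two prompt-like windows, one hold-music window, one conversational window.
windows = [
    {"synth": 0.9, "music": 0.0, "turns": 0.0},
    {"synth": 0.8, "music": 0.1, "turns": 0.0},
    {"synth": 0.0, "music": 0.9, "turns": 0.0},
    {"synth": 0.0, "music": 0.1, "turns": 0.8},
]
result = segment(windows)
print(result)
```

This sketch classifies each window independently. A sequence model such as a CRF would additionally score transitions between neighboring labels, which can discourage implausible jumps between segment types.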

Bibliographic reference. Zinovieva, Nina / Zhuang, Xiaodan / Peterson, Pat / Alwan, Joe / Prasad, Rohit (2013): "Probabilistic trainable segmenter for call center audio using multiple features", in INTERSPEECH-2013, 2054-2058.