11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Hierarchical Classification for Speech-to-Speech Translation

Emil Ettelaie, Panayiotis G. Georgiou, Shrikanth S. Narayanan

University of Southern California, USA

Concept classifiers have been used in speech-to-speech translation systems. Their effectiveness, however, depends on the size of the domain that they cover. The main bottleneck in expanding the classifier domain is the degradation in accuracy as the number of classes increases. Here we introduce a hierarchical classification process that aims to scale up the domain without compromising accuracy. We propose to exploit the categorical associations that naturally appear in the training data to split the domain into sub-domains with fewer classes. We use two methods, language-model-based classification and topic modeling with latent Dirichlet allocation, to exploit discourse information for sub-domain detection. The classification task is performed in two steps. First, the best category for the discourse is detected using one of the above methods. Then a sub-domain classifier, limited to that category, is deployed. Empirical results from our experiments show higher accuracy for the proposed method than for a single-layer classifier.
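
The two-step process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes add-one smoothed unigram language models for both steps (a simple stand-in for the language-model-based method; the LDA variant is not shown), and the toy data, category names, class names, and function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (category, concept class, utterance).
TRAIN = [
    ("medical", "ask_pain", "where does it hurt"),
    ("medical", "ask_pain", "do you feel pain"),
    ("medical", "ask_fever", "do you have a fever"),
    ("travel", "ask_ticket", "show me your ticket"),
    ("travel", "ask_destination", "where are you going"),
]


class UnigramLM:
    """One add-one smoothed unigram model per label (assumed, not from the paper)."""

    def __init__(self, pairs):
        self.counts = defaultdict(Counter)
        for label, text in pairs:
            self.counts[label].update(text.split())
        self.vocab = {w for c in self.counts.values() for w in c}

    def score(self, label, text):
        c = self.counts[label]
        # +1 reserves probability mass for words never seen in training.
        total = sum(c.values()) + len(self.vocab) + 1
        return sum(math.log((c[w] + 1) / total) for w in text.split())

    def predict(self, text):
        # Pick the label whose model gives the text the highest log-likelihood.
        return max(self.counts, key=lambda label: self.score(label, text))


def train_hierarchy(data):
    """Build a category detector plus one concept classifier per category."""
    top = UnigramLM([(cat, utt) for cat, _, utt in data])
    by_cat = defaultdict(list)
    for cat, cls, utt in data:
        by_cat[cat].append((cls, utt))
    subs = {cat: UnigramLM(pairs) for cat, pairs in by_cat.items()}
    return top, subs


def classify(top, subs, discourse, utterance):
    """Step 1: detect the category from the discourse.
    Step 2: classify the utterance with that category's classifier only."""
    category = top.predict(discourse)
    return category, subs[category].predict(utterance)


top, subs = train_hierarchy(TRAIN)
print(classify(top, subs, "do you have pain or a fever", "do you have a fever"))
```

Because the second step only compares classes inside the detected category, each sub-domain classifier faces fewer competing classes than a single flat classifier would, which is the effect the abstract aims for.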

Bibliographic reference.  Ettelaie, Emil / Georgiou, Panayiotis G. / Narayanan, Shrikanth S. (2010): "Hierarchical classification for speech-to-speech translation", In INTERSPEECH-2010, 2530-2533.