SAPA-SCALE Conference 2012

Portland, OR, USA
September 7-8, 2012

Language Identification using Spectro-Temporal Patch Features

Kamal Sahni (1), Pranay Dighe (2), Rita Singh (3), Bhiksha Raj (3)

(1) Department of Electrical Engineering, Indian Institute of Technology Kanpur, India
(2) Department of Computer Science & Engineering, Indian Institute of Technology Kanpur, India
(3) Language Technologies Institute, Carnegie Mellon University, USA

We present a novel approach to automatic Language Identification (LID) using spectro-temporal patch features. Our approach is based on the premise that speech and spoken phenomena are characterized by typical visible patterns in time-frequency representations of the signal, and that the manner in which these patterns occur is language specific. To model this, we build a library of randomly selected spectro-temporal patterns from spoken examples of a language, and derive features from the correlations of this library with spectrograms of the speech signal. Under our hypothesis, the relative frequency of correlation peaks should differ across languages. We exploit this by training a discriminative classifier on these features to detect the presence of the language in a recording. The proposed approach has been tested on two datasets: the VoxForge multilingual speech data and the CallFriend corpus available from the Linguistic Data Consortium (LDC).
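
As a rough illustration of the feature extraction outlined in the abstract, the following Python sketch samples random patches from a spectrogram, correlates each patch with a spectrogram, and records how often the correlation exceeds a threshold. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the patch size, the use of normalized cross-correlation, and the threshold of 0.6 are all hypothetical choices made for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sample_patches(spec, n_patches, shape, rng):
    """Draw n_patches random (freq x time) patches from a spectrogram."""
    h, w = shape
    rows = rng.integers(0, spec.shape[0] - h + 1, size=n_patches)
    cols = rng.integers(0, spec.shape[1] - w + 1, size=n_patches)
    return [spec[r:r + h, c:c + w].copy() for r, c in zip(rows, cols)]


def normalized_correlation(spec, patch):
    """Normalized cross-correlation of one patch at every valid position."""
    # windows has shape (H - h + 1, W - w + 1, h, w)
    windows = sliding_window_view(spec, patch.shape)
    win = windows - windows.mean(axis=(-2, -1), keepdims=True)
    p = patch - patch.mean()
    num = (win * p).sum(axis=(-2, -1))
    den = np.sqrt((win ** 2).sum(axis=(-2, -1)) * (p ** 2).sum()) + 1e-12
    return num / den


def patch_features(spec, library, threshold=0.6):
    """One feature per patch: fraction of positions with a strong match.

    The threshold is an assumed value; the abstract does not specify
    how correlation peaks are detected or counted.
    """
    return np.array([
        (normalized_correlation(spec, p) > threshold).mean()
        for p in library
    ])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for a real spectrogram (frequency bins x time frames).
    spec = rng.random((20, 50))
    library = sample_patches(spec, n_patches=5, shape=(4, 6), rng=rng)
    print(patch_features(spec, library))
```

Feature vectors of this kind, computed per recording, could then be passed to any discriminative classifier trained to separate a target language from the others.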

Index Terms: Language identification, Spectro-temporal patches, Discriminative classification

Bibliographic reference. Sahni, Kamal / Dighe, Pranay / Singh, Rita / Raj, Bhiksha (2012): "Language identification using spectro-temporal patch features", in SAPA-SCALE-2012, 110-113.