INTERSPEECH 2006 - ICSLP
Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Vector-Based Spoken Language Recognition Using Output Coding

Haizhou Li, Bin Ma, Rong Tong

Institute for Infocomm Research, Singapore

The vector-based approach to spoken language recognition converts a spoken utterance into a high-dimensional vector, also known as a bag-of-sounds vector, that consists of n-gram statistics of acoustic units. Dimensionality reduction can better prepare the bag-of-sounds vectors for classifier design. We propose projecting the bag-of-sounds vectors onto a low-dimensional SVM output coding space, where each dimension represents a decision hyperplane between a pair of spoken languages. We also compare the performance of the output coding approach with that of the traditional low-rank approximation approach using latent semantic indexing (LSI) on the NIST 1996, 2003 and 2005 Language Recognition Evaluation (LRE) databases. The experiments show that the output coding approach consistently outperforms LSI and achieves competitive results.
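As a rough illustration of the two steps described above, the following Python sketch builds a bag-of-sounds vector from a token sequence and projects it onto one dimension per language pair. It is not the authors' implementation. The function names `bag_of_sounds` and `output_code`, the two-unit inventory, the language labels and the hand-set hyperplane weights are all hypothetical. In practice the hyperplanes would come from trained pairwise SVMs, and the abstract does not specify the n-gram order or any weighting scheme.

```python
from collections import Counter
from itertools import combinations, product


def bag_of_sounds(tokens, units, max_n=2):
    """Return relative n-gram frequencies (n = 1..max_n) in a fixed feature order."""
    vector = []
    for n in range(1, max_n + 1):
        grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
        total = sum(grams.values()) or 1  # guard against inputs shorter than n
        # One feature per possible n-gram over the unit inventory.
        vector.extend(grams[g] / total for g in product(units, repeat=n))
    return vector


def output_code(vector, hyperplanes):
    """Project a vector onto one dimension per language pair.

    hyperplanes maps (lang_a, lang_b) -> (weights, bias). Each output
    coordinate is the signed decision value w.x + b of that pairwise classifier.
    """
    return [
        sum(w * x for w, x in zip(weights, vector)) + bias
        for _, (weights, bias) in sorted(hyperplanes.items())
    ]


# Toy data (assumed): two acoustic units and three languages.
units = ["a", "b"]
languages = ["en", "ja", "zh"]
vec = bag_of_sounds(["a", "b", "a", "b"], units)  # 2 unigrams + 4 bigrams

# Hand-set hyperplanes standing in for trained pairwise SVMs (assumption).
pairs = list(combinations(languages, 2))
hyperplanes = {
    pair: ([float(i == k) for i in range(len(vec))], -0.25)
    for k, pair in enumerate(pairs)
}
code = output_code(vec, hyperplanes)  # one coordinate per language pair
```

With L languages the projected vector has L(L-1)/2 coordinates, which is typically far fewer than the number of n-gram features. For a two-unit inventory with bigrams the input has 6 features, while a realistic inventory of acoustic units would yield many thousands.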


Bibliographic reference. Li, Haizhou / Ma, Bin / Tong, Rong (2006): "Vector-based spoken language recognition using output coding", in INTERSPEECH-2006, paper 1155-Mon2CaP.8.