ISCA Archive Interspeech 2006

Vector-based spoken language recognition using output coding

Haizhou Li, Bin Ma, Rong Tong

The vector-based spoken language recognition approach converts a spoken utterance into a high-dimensional vector, also known as a bag-of-sounds vector, that consists of n-gram statistics of acoustic units. Dimensionality reduction better prepares the bag-of-sounds vectors for classifier design. We propose projecting the bag-of-sounds vectors onto a low-dimensional SVM output coding space, where each dimension represents a decision hyperplane between a pair of spoken languages. We also compare the performance of the output coding approach with that of the traditional low-rank approximation approach using latent semantic indexing (LSI) on the NIST 1996, 2003 and 2005 Language Recognition Evaluation (LRE) databases. The experiments show that the output coding approach consistently outperforms LSI and achieves competitive results.
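The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy unit inventory, the language set, the function names `bag_of_sounds` and `output_code`, and the hyperplane weights are all hypothetical placeholders. In practice each hyperplane would come from an SVM trained on one pair of languages, which gives L(L-1)/2 output dimensions for L languages.

```python
from collections import Counter
from itertools import combinations, product


def bag_of_sounds(units, inventory, n=2):
    """Map a sequence of acoustic-unit labels to relative n-gram frequencies."""
    grams = Counter(tuple(units[i:i + n]) for i in range(len(units) - n + 1))
    total = sum(grams.values()) or 1  # avoid division by zero on short input
    # One dimension per possible n-gram, enumerated in a fixed order.
    return [grams[g] / total for g in product(inventory, repeat=n)]


def output_code(vector, hyperplanes):
    """Project a vector onto pairwise decision hyperplanes given as (w, b)."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, vector)) + b
            for w, b in hyperplanes]


inventory = ["a", "b"]          # toy acoustic-unit inventory (assumption)
languages = ["en", "zh", "es"]  # toy language set (assumption)
pairs = list(combinations(languages, 2))  # one hyperplane per language pair

# Placeholder (w, b) values standing in for trained pairwise SVMs (assumption).
hyperplanes = [
    ([1.0, 0.0, 0.0, -1.0], 0.0),
    ([0.0, 1.0, -1.0, 0.0], 0.0),
    ([0.5, 0.5, -0.5, -0.5], 0.1),
]

x = bag_of_sounds(["a", "b", "a", "a"], inventory, n=2)  # 4-dimensional
code = output_code(x, hyperplanes)                        # 3-dimensional
```

Here a 4-dimensional bigram vector is reduced to 3 output-code dimensions; with a realistic unit inventory the bag-of-sounds vector is far larger, while the output code still has only one dimension per language pair.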

doi: 10.21437/Interspeech.2006-139

Cite as: Li, H., Ma, B., Tong, R. (2006) Vector-based spoken language recognition using output coding. Proc. Interspeech 2006, paper 1155-Mon2CaP.8, doi: 10.21437/Interspeech.2006-139

  author={Haizhou Li and Bin Ma and Rong Tong},
  title={{Vector-based spoken language recognition using output coding}},
  booktitle={Proc. Interspeech 2006},
  pages={paper 1155-Mon2CaP.8},