ISCA Archive Interspeech 2015

Under-resourced speech recognition based on the speech manifold

Reza Sahraeian, Dirk Van Compernolle, Febe de Wet

Conventional acoustic modeling involves estimating many parameters to model feature distributions effectively. When speech and text data are sparse, however, these estimates become less reliable, which makes speech recognition a challenging task. In this paper, we propose using Intrinsic Spectral Analysis (ISA), a nonlinear feature transformation based on the speech manifold, for under-resourced speech recognition. First, we investigate the usefulness of ISA features in low-resource scenarios for both Gaussian mixture and deep neural network (DNN) acoustic modeling. Second, because ISA features are connected to the articulatory configuration space, this feature space is potentially less language dependent than typical spectral-based features, so exploiting out-of-language data in this feature space is expected to be beneficial. We demonstrate the positive effect of ISA in the framework of multilingual DNN systems, where Flemish and Afrikaans are used as the donor and under-resourced target languages, respectively. We compare the performance of ISA with conventional features in both multilingual and under-resourced monolingual conditions.

doi: 10.21437/Interspeech.2015-315

Cite as: Sahraeian, R., Van Compernolle, D., de Wet, F. (2015) Under-resourced speech recognition based on the speech manifold. Proc. Interspeech 2015, 1255-1259, doi: 10.21437/Interspeech.2015-315

  author={Reza Sahraeian and Dirk Van Compernolle and Febe de Wet},
  title={{Under-resourced speech recognition based on the speech manifold}},
  booktitle={Proc. Interspeech 2015},