Low-Rank Representation of Nearest Neighbor Posterior Probabilities to Enhance DNN Based Acoustic Modeling

Gil Luyet, Pranay Dighe, Afsaneh Asaei, Hervé Bourlard


We hypothesize that optimal deep neural network (DNN) class-conditional posterior probabilities live in a union of low-dimensional subspaces. In real test conditions, DNN posteriors encode uncertainties which can be regarded as a superposition of unstructured sparse noise on the optimal posteriors. We investigate different ways to structure the DNN outputs by exploiting low-rank representation (LRR) techniques. Using a large number of training posterior vectors, the underlying low-dimensional subspace of a test posterior is identified through nearest neighbor analysis, and low-rank decomposition separates the “optimal” posteriors from the spurious uncertainties at the DNN output. Experiments demonstrate that, by processing subsets of posteriors which possess strong subspace similarity, low-rank representation enhances the posterior probabilities and leads to higher speech recognition accuracy in a hybrid DNN-hidden Markov model (HMM) system.
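The pipeline sketched in the abstract can be illustrated with a minimal NumPy example: gather the nearest training posteriors of a test posterior, stack them into a matrix, and replace the test vector by its low-rank reconstruction. Note this sketch uses cosine-similarity nearest neighbors and a plain truncated-SVD approximation as a stand-in for the paper's LRR solver; the function name, the choice of `k` and `rank`, and the renormalization step are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def enhance_posterior(test_post, train_posts, k=10, rank=2):
    """Hypothetical sketch: nearest-neighbor low-rank enhancement of a
    DNN posterior vector. Stands in for the paper's LRR formulation."""
    # Cosine similarity of the test posterior to every training posterior.
    t = test_post / np.linalg.norm(test_post)
    T = train_posts / np.linalg.norm(train_posts, axis=1, keepdims=True)
    nn = np.argsort(T @ t)[-k:]  # indices of the k most similar neighbors

    # Stack test + neighbors; rows are assumed to share a low-dim subspace.
    M = np.vstack([test_post, train_posts[nn]])

    # Low-rank approximation via truncated SVD (proxy for LRR).
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    M_lr = (U[:, :rank] * s[:rank]) @ Vt[:rank]

    # Enhanced posterior: clip spurious negatives, renormalize to simplex.
    p = np.clip(M_lr[0], 0.0, None)
    return p / p.sum()

# Toy usage with synthetic posteriors concentrated on class 0.
rng = np.random.default_rng(0)
train = rng.dirichlet([8, 1, 1, 1], size=200)  # "training" posteriors
test = rng.dirichlet([8, 1, 1, 1])             # noisy test posterior
enhanced = enhance_posterior(test, train, k=20, rank=2)
print("enhanced posterior:", enhanced)
```

Restricting the decomposition to the k nearest neighbors mirrors the abstract's point that low-rank structure only emerges within subsets of posteriors with strong subspace similarity; applied to the full training set, a single low-rank model would have to span all classes at once.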


DOI: 10.21437/Interspeech.2016-1279

Cite as

Luyet, G., Dighe, P., Asaei, A., Bourlard, H. (2016) Low-Rank Representation of Nearest Neighbor Posterior Probabilities to Enhance DNN Based Acoustic Modeling. Proc. Interspeech 2016, 3449-3453.

Bibtex
@inproceedings{Luyet+2016,
  author={Gil Luyet and Pranay Dighe and Afsaneh Asaei and Hervé Bourlard},
  title={Low-Rank Representation of Nearest Neighbor Posterior Probabilities to Enhance DNN Based Acoustic Modeling},
  year={2016},
  booktitle={Interspeech 2016},
  pages={3449--3},
  doi={10.21437/Interspeech.2016-1279},
  url={http://dx.doi.org/10.21437/Interspeech.2016-1279}
}