ISCA Archive Interspeech 2013

Information theoretic acoustic feature selection for acoustic-to-articulatory inversion

Prasanta Kumar Ghosh, Shrikanth Narayanan

We use mutual information as the criterion to rank the Mel frequency cepstral coefficients (MFCCs) and their derivatives according to the information they provide about different articulatory features in acoustic-to-articulatory (AtoA) inversion. We find that only a small subset of the coefficients encodes maximal information about the articulatory features and, interestingly, that this subset is specific to each articulatory feature. We use these subsets of MFCCs (+derivatives) in AtoA inversion with Gaussian mixture model (GMM) mapping. Inversion experiments with articulatory data support the information theoretic finding: the subsets of MFCCs (+derivatives) selected by the feature ranking method are sufficient to achieve inversion performance similar to that obtained with the conventional full set of MFCCs (+derivatives). This drastically reduces the modeling complexity of the GMM-based acoustic-articulatory map without significantly degrading inversion performance.
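
As a rough illustration of ranking features by mutual information, the sketch below scores each acoustic feature column against a single articulatory target and sorts the columns by score. It is not the authors' implementation: the abstract does not say which mutual information estimator was used, so the equal-width histogram estimator, the bin count, the function names (`discretize`, `mutual_information`, `rank_features`), and the synthetic data are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def discretize(values, bins):
    """Map real values to equal-width bin indices in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X; Y) in bits. Assumed estimator."""
    xs, ys = discretize(x, bins), discretize(y, bins)
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi


def rank_features(feature_columns, target, bins=8):
    """Return feature indices sorted by decreasing MI with the target."""
    scores = [mutual_information(col, target, bins) for col in feature_columns]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Synthetic stand-in data (hypothetical, not MFCCs or articulatory recordings).
rng = random.Random(0)
n_frames = 2000
target = [rng.gauss(0.0, 1.0) for _ in range(n_frames)]
features = [
    [t + 0.1 * rng.gauss(0.0, 1.0) for t in target],  # strongly informative
    [rng.gauss(0.0, 1.0) for _ in range(n_frames)],   # independent of target
    [t + 1.0 * rng.gauss(0.0, 1.0) for t in target],  # weakly informative
]

order, scores = rank_features(features, target)
print("ranking:", order)
print("scores (bits):", [round(s, 3) for s in scores])
```

In a setting like the one described above, each column would hold one MFCC or derivative coefficient across frames, the ranking would be repeated for each articulatory feature, and only the top-ranked coefficients would be passed to the GMM mapping. Note that histogram estimates are biased upward for small samples, so the bin count should be chosen with the amount of data in mind.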


doi: 10.21437/Interspeech.2013-705

Cite as: Ghosh, P.K., Narayanan, S. (2013) Information theoretic acoustic feature selection for acoustic-to-articulatory inversion. Proc. Interspeech 2013, 3177-3181, doi: 10.21437/Interspeech.2013-705

@inproceedings{ghosh13_interspeech,
  author={Prasanta Kumar Ghosh and Shrikanth Narayanan},
  title={{Information theoretic acoustic feature selection for acoustic-to-articulatory inversion}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={3177--3181},
  doi={10.21437/Interspeech.2013-705}
}