Hierarchical Constrained Bayesian Optimization for Feature, Acoustic Model and Decoder Parameter Optimization

Akshay Chandrashekaran, Ian Lane


We describe the implementation of a hierarchical constrained Bayesian optimization algorithm and its application to the joint optimization of features, acoustic model structure, and decoding parameters for deep neural network (DNN)-based large vocabulary continuous speech recognition (LVCSR) systems. Within our hierarchical optimization method we perform joint constrained Bayesian optimization of the feature hyper-parameters and acoustic model structure at the first level, and then perform an iteration of constrained Bayesian optimization over the decoder hyper-parameters at the second. We show that the proposed hierarchical optimization method can generate a model with higher performance than a manually optimized system on a server platform. Furthermore, we demonstrate that the proposed framework can be used to automatically build real-time speech recognition systems for graphics processing unit (GPU)-enabled embedded platforms that retain accuracy similar to a server platform, while running with constrained computing resources.
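The two-level procedure the abstract describes can be sketched as nested runs of constrained Bayesian optimization, where the acquisition function weights expected improvement by the modeled probability that a resource constraint is satisfied. The sketch below is a minimal illustration, not the authors' implementation: the GP surrogate, the WER proxies (`wer_am`, `wer_dec`), and the memory/real-time constraints are all hypothetical stand-ins.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, ls=0.25):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def gp_posterior(X, y, Xq):
    """Zero-mean GP posterior mean/std-dev at query points Xq."""
    K = rbf(X, X) + 1e-6 * np.eye(len(X))   # jitter for stability
    Ks = rbf(Xq, X)
    Kinv = np.linalg.inv(K)
    mu = Ks @ Kinv @ y
    var = 1.0 - np.einsum("ij,jk,ik->i", Ks, Kinv, Ks)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def norm_pdf(z):
    return np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    return np.array([0.5 * (1 + math.erf(v / math.sqrt(2)))
                     for v in np.atleast_1d(z)])

def constrained_bo(objective, constraint, dim, n_init=5, n_iter=15):
    """Minimize objective(x) subject to constraint(x) <= 0 on [0, 1]^dim,
    picking points by expected improvement x probability of feasibility."""
    X = rng.uniform(size=(n_init, dim))
    f = np.array([objective(x) for x in X])
    c = np.array([constraint(x) for x in X])
    for _ in range(n_iter):
        cand = rng.uniform(size=(256, dim))          # random candidate pool
        mu_f, sd_f = gp_posterior(X, f, cand)        # surrogate for objective
        mu_c, sd_c = gp_posterior(X, c, cand)        # surrogate for constraint
        feas = f[c <= 0]
        best = feas.min() if len(feas) else f.min()
        z = (best - mu_f) / sd_f
        ei = sd_f * (z * norm_cdf(z) + norm_pdf(z))  # expected improvement
        acq = ei * norm_cdf(-mu_c / sd_c)            # weight by P(c(x) <= 0)
        x_next = cand[np.argmax(acq)]
        X = np.vstack([X, x_next])
        f = np.append(f, objective(x_next))
        c = np.append(c, constraint(x_next))
    idx = np.where(c <= 0)[0]
    i = idx[np.argmin(f[idx])] if len(idx) else np.argmin(f)
    return X[i], f[i]

# Level 1 (hypothetical): feature + acoustic-model hyper-parameters,
# a toy 2-D WER proxy under a toy memory-budget constraint.
wer_am = lambda x: (x[0] - 0.6) ** 2 + (x[1] - 0.3) ** 2
mem_budget = lambda x: x[0] + x[1] - 1.2      # feasible when <= 0
am_cfg, am_wer = constrained_bo(wer_am, mem_budget, dim=2)

# Level 2 (hypothetical): decoder hyper-parameters conditioned on the
# chosen acoustic model, under a toy real-time-factor constraint.
wer_dec = lambda x: am_wer + (x[0] - 0.5) ** 2
rtf_limit = lambda x: x[0] - 0.9              # feasible when <= 0
dec_cfg, final_wer = constrained_bo(wer_dec, rtf_limit, dim=1)
```

A real system would replace the toy functions with measured WER and resource usage from training and decoding runs; the hierarchical split keeps the expensive first-level evaluations (retraining the acoustic model) separate from the cheap second-level ones (re-decoding with new parameters).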


 DOI: 10.21437/Interspeech.2017-1583

Cite as: Chandrashekaran, A., Lane, I. (2017) Hierarchical Constrained Bayesian Optimization for Feature, Acoustic Model and Decoder Parameter Optimization. Proc. Interspeech 2017, 538-542, DOI: 10.21437/Interspeech.2017-1583.


@inproceedings{Chandrashekaran2017,
  author={Akshay Chandrashekaran and Ian Lane},
  title={Hierarchical Constrained Bayesian Optimization for Feature, Acoustic Model and Decoder Parameter Optimization},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={538--542},
  doi={10.21437/Interspeech.2017-1583},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1583}
}