An Adaptive Multi-Band System for Low Power Voice Command Recognition

Qing He, Gregory W. Wornell, Wei Ma


A complete voice-driven experience in applications such as wearable electronics requires always-on keyword monitoring, which consumes a prohibitive amount of power with current speech recognition methods. In this work, we propose an ultra-low-power voice command recognition system designed to recognize short commands such as ‘Hi Galaxy’. To achieve a power-efficient design, the system uses adaptive feature pre-selection: only a subset of the available features is selected and extracted, based on the noise spectrum. The back-end classifier supports this adaptive feature selection through a novel multi-band deep neural network (DNN) model that processes only the selected features at each decision. In experiments, our adaptive scheme, using an average of 5 spectral feature bands, achieves comparable accuracy and improved efficiency relative to a generic fully-connected DNN model that uses the full speech spectrum. The system makes a recognition decision every 40 ms on 1.2 s of buffered speech and consumes ~230 µW of power, thus promising low-power, low-complexity, and robust application-specific voice recognition.
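
The abstract does not give the selection rule or the way per-band outputs are combined, so the following is only a minimal sketch of the general idea of noise-adaptive band selection, not the authors' method. It assumes that each band has a speech-energy prior and a running noise-energy estimate, that bands are ranked by estimated SNR, and that per-band keyword scores are fused by a plain average over the selected bands. The names `select_bands` and `fuse_band_scores`, the top-k rule, and the averaging step are all hypothetical; only the default of 5 bands echoes the average reported above.

```python
import math

EPS = 1e-12  # floor to avoid log(0) and division by zero


def select_bands(speech_energy, noise_energy, k=5):
    """Return the indices of the k bands with the highest estimated SNR.

    Assumption: a simple top-k rule on per-band SNR in dB. The paper's
    actual selection criterion is not stated in the abstract.
    """
    if len(speech_energy) != len(noise_energy):
        raise ValueError("speech and noise estimates must cover the same bands")
    snr_db = [
        10.0 * math.log10(max(s, EPS) / max(n, EPS))
        for s, n in zip(speech_energy, noise_energy)
    ]
    # Rank band indices from highest to lowest SNR, keep the best k.
    ranked = sorted(range(len(snr_db)), key=lambda b: snr_db[b], reverse=True)
    return sorted(ranked[:k])


def fuse_band_scores(band_scores, selected):
    """Average per-band keyword scores over the selected bands only.

    Assumption: averaging stands in for the multi-band DNN back end,
    whose real structure is not described in the abstract.
    """
    if not selected:
        raise ValueError("at least one band must be selected")
    return sum(band_scores[b] for b in selected) / len(selected)


# Hypothetical 8-band example: flat speech prior, noise concentrated in bands 0, 3, 6.
speech = [1.0] * 8
noise = [0.5, 0.01, 0.02, 2.0, 0.03, 0.04, 1.5, 0.05]
selected = select_bands(speech, noise, k=5)

# Hypothetical per-band keyword scores; the noisy bands are never consulted.
scores = [0.1, 0.9, 0.8, 0.2, 0.7, 0.9, 0.1, 0.7]
decision_score = fuse_band_scores(scores, selected)
```

In this sketch, features for unselected bands never need to be extracted at all, which is where a saving in front-end computation would come from under these assumptions.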


DOI: 10.21437/Interspeech.2016-1562

Cite as

He, Q., Wornell, G.W., Ma, W. (2016) An Adaptive Multi-Band System for Low Power Voice Command Recognition. Proc. Interspeech 2016, 1888-1892.

Bibtex
@inproceedings{He+2016,
author={Qing He and Gregory W. Wornell and Wei Ma},
title={An Adaptive Multi-Band System for Low Power Voice Command Recognition},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1562},
url={http://dx.doi.org/10.21437/Interspeech.2016-1562},
pages={1888--1892}
}