ISCA Archive Odyssey 2020

Small Footprint Multi-channel Keyword Spotting

Jilong Wu, Yiteng Huang, Hyun-Jin Park, Niranjan Subrahmanya, Patrick Violette

Noise robustness remains a challenging problem in on-device keyword spotting. Multi-microphone algorithms such as beamforming improve accuracy, but they inevitably increase computational complexity and tend to require more memory. In this paper, we propose a new neural-network-based architecture that takes multiple microphone signals as inputs. It achieves better accuracy while incurring only a minimal increase in model size. Compared with a single-channel baseline that runs in parallel on each channel, the proposed architecture reduces the false reject (FR) rate by 36.3% and 46.4% relative on dual-microphone clean and noisy test sets, respectively, at a fixed false accept rate.
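As a reading aid for the "relative" reductions quoted above, the following minimal sketch shows how a relative FR reduction maps to an absolute FR rate. The baseline FR rate used here is a purely hypothetical value chosen for illustration, since the abstract reports no absolute rates, and the function names are likewise assumptions, not anything from the paper.

```python
def relative_reduction(baseline_rate: float, new_rate: float) -> float:
    """Relative reduction of new_rate with respect to baseline_rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - new_rate) / baseline_rate


def rate_after_reduction(baseline_rate: float, reduction: float) -> float:
    """Absolute rate left after applying a relative reduction."""
    return baseline_rate * (1.0 - reduction)


# HYPOTHETICAL baseline FR rate (not reported in the abstract).
hypothetical_baseline_fr = 0.10

# Relative reductions quoted in the abstract (clean and noisy test sets).
clean_fr = rate_after_reduction(hypothetical_baseline_fr, 0.363)
noisy_fr = rate_after_reduction(hypothetical_baseline_fr, 0.464)

print(f"clean: {clean_fr:.4f}, noisy: {noisy_fr:.4f}")
```

Under this assumed baseline, a 36.3% relative reduction leaves 63.7% of the original false rejects, and a 46.4% relative reduction leaves 53.6%; the comparison is only meaningful because both systems are evaluated at the same fixed false accept rate.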


doi: 10.21437/Odyssey.2020-55

Cite as: Wu, J., Huang, Y., Park, H.-J., Subrahmanya, N., Violette, P. (2020) Small Footprint Multi-channel Keyword Spotting. Proc. The Speaker and Language Recognition Workshop (Odyssey 2020), 391-395, doi: 10.21437/Odyssey.2020-55

@inproceedings{wu20_odyssey,
  author={Jilong Wu and Yiteng Huang and Hyun-Jin Park and Niranjan Subrahmanya and Patrick Violette},
  title={{Small Footprint Multi-channel Keyword Spotting}},
  year=2020,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2020)},
  pages={391--395},
  doi={10.21437/Odyssey.2020-55}
}