ISCA Archive Interspeech 2009

Profiling large-vocabulary continuous speech recognition on embedded devices: a hardware resource sensitivity analysis

Kai Yu, Rob A. Rutenbar

When deployed in embedded systems, speech recognizers are necessarily scaled down from the large-vocabulary continuous speech recognizers (LVCSR) found on desktops or servers to fit the limited hardware. However, embedded hardware continues to evolve in capability; today’s smartphones are vastly more powerful than their recent ancestors. This raises a new question: which hardware features not currently found on today’s embedded platforms, but potential add-ons to tomorrow’s devices, are most likely to improve recognition performance? Said differently — what is the sensitivity of the recognizer to fine-grain details of the embedded hardware resources? To answer this question rigorously and quantitatively, we offer results from a detailed study of LVCSR performance as a function of micro-architecture options on an embedded ARM11 and an enterprise-class Intel Core2Duo. We estimate speed and energy consumption, and show, feature by feature, how hardware resources impact recognizer performance.


doi: 10.21437/Interspeech.2009-556

Cite as: Yu, K., Rutenbar, R.A. (2009) Profiling large-vocabulary continuous speech recognition on embedded devices: a hardware resource sensitivity analysis. Proc. Interspeech 2009, 1923-1926, doi: 10.21437/Interspeech.2009-556

@inproceedings{yu09c_interspeech,
  author={Kai Yu and Rob A. Rutenbar},
  title={{Profiling large-vocabulary continuous speech recognition on embedded devices: a hardware resource sensitivity analysis}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1923--1926},
  doi={10.21437/Interspeech.2009-556}
}