Word spotting, or keyword identification, is a highly challenging task when multiple speakers talk simultaneously. When a game is controlled by children solely through voice, the task becomes extremely difficult. Children, unlike adults, typically do not wait their turn to speak in an orderly fashion. They interrupt and shout at arbitrary times, say things that are outside the game vocabulary, arbitrarily stretch, contract, distort or rapidly repeat words, and do not stay in one location either horizontally or vertically. Consequently, standard state-of-the-art keyword spotting systems that work admirably for adults in multiple-keyword settings fail to perform well for children even in a basic two-word vocabulary keyword spotting task. This paper highlights the issues with keyword spotting using a simple two-word game played by children of different age groups, and gives quantitative performance assessments using a novel keyword spotting technique that is especially suited to such scenarios.
Bibliographic reference. Sundar, Harshavardhan / Lehman, Jill Fain / Singh, Rita (2015): "Keyword spotting in multi-player voice driven games for children", In INTERSPEECH-2015, 1660-1664.