This paper describes the development of a speech recognition and answer generation module for preschool children within a speech-oriented guidance system. The topic requires dedicated treatment because recognition performance for preschool children is still disproportionately low compared with that for older children, business demand is growing, and relatively little research on ASR for preschool children has been carried out. This is especially true for building practical applications. A real-environment speech database with more than 12,000 utterances from Japanese preschool children and more than 60,000 utterances from school children is employed for system development. The gap between preschool children's pronunciation and standard pronunciation is narrowed by introducing uniform reference transcriptions and pronunciation modeling. Furthermore, the language and acoustic models are optimized. The final evaluation shows that the speech-oriented guidance system's response accuracy can be improved by more than 12% absolute.
Bibliographic reference. Cincarek, Tobias / Shindo, Izumi / Toda, Tomoki / Saruwatari, Hiroshi / Shikano, Kiyohiro (2007): "Development of preschool children subsystem for ASR and q&a in a real-environment speech-oriented guidance task", In INTERSPEECH-2007, 1469-1472.