INTERSPEECH 2004 - ICSLP
In this paper we propose a voice activation method based on prosodic keyword verification. Current voice activation systems have so far not considered prosodic features such as the fundamental frequency contour. Normally, a continuously listening word spotter is used to detect a predefined keyword. We conducted an experiment which shows that people emphasize this keyword when they address a recognizer. To capture this prosodic information, we trained an HMM on the fundamental frequency and energy contours of the keyword. The prosodic model is then used to verify the keyword hypotheses of a phonetic recognizer. We investigated how well the prosodic model distinguishes between the same keyword spoken in command and non-command phrases. Introducing the prosodic information significantly reduced the false alarm rate, while the detection rate was only slightly degraded.
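The verification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the model topology, the feature normalization, or the decision rule. The sketch assumes a three-state left-to-right HMM with one diagonal Gaussian per state, z-normalized (F0, energy) frames, and a threshold on the per-frame log-likelihood. All names and numeric values (`verify_keyword`, `MEANS`, `THRESHOLD`, and so on) are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(frames, means, variances, stay_prob=0.5):
    """Forward algorithm for a left-to-right HMM (assumed topology).

    Paths must start in the first state and end in the last state.
    Each state either loops (stay_prob) or moves to the next state.
    """
    n = len(means)
    log_stay = math.log(stay_prob)
    log_move = math.log(1.0 - stay_prob)
    alpha = [-math.inf] * n
    alpha[0] = log_gauss(frames[0], means[0], variances[0])
    for x in frames[1:]:
        new_alpha = []
        for j in range(n):
            candidates = [alpha[j] + log_stay]
            if j > 0:
                candidates.append(alpha[j - 1] + log_move)
            new_alpha.append(logsumexp(candidates)
                             + log_gauss(x, means[j], variances[j]))
        alpha = new_alpha
    return alpha[-1]


def verify_keyword(frames, means, variances, threshold):
    """Accept a keyword hypothesis if its per-frame score reaches threshold."""
    score = forward_log_likelihood(frames, means, variances) / len(frames)
    return score >= threshold, score


# Hypothetical model of an emphasized keyword: F0 and energy rise, then fall.
MEANS = [(0.0, 0.0), (1.0, 1.0), (0.0, 0.0)]
VARIANCES = [(0.1, 0.1)] * 3
THRESHOLD = -0.5  # illustrative value; would be tuned on held-out data

# Made-up (F0, energy) contours for two hypotheses of the same keyword.
emphasized = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.9),
              (1.1, 1.0), (0.1, 0.0), (0.0, 0.0)]
flat = [(0.0, 0.0)] * 6

accept_emph, score_emph = verify_keyword(emphasized, MEANS, VARIANCES, THRESHOLD)
accept_flat, score_flat = verify_keyword(flat, MEANS, VARIANCES, THRESHOLD)
print(accept_emph, accept_flat)
```

In this toy setup the emphasized contour is accepted and the flat contour is rejected. In a real system the hypotheses would come from the phonetic recognizer, and the threshold would set the trade-off between false alarm rate and detection rate.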
Bibliographic reference. Kühne, Marco / Wolff, Matthias / Eichner, Matthias / Hoffmann, Rüdiger (2004): "Voice activation using prosodic features", In INTERSPEECH-2004, 3001-3004.