EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Speech Shift: Direct Speech-Input-Mode Switching Through Intentional Control of Voice Pitch

Masataka Goto (1), Yukihiro Omoto (2), Katunobu Itou (1), Tetsunori Kobayashi (2)

(1) AIST, Japan
(2) Waseda University, Japan

This paper describes a speech-input interface function, called speech shift, that lets a user specify the speech-input mode simply by changing (shifting) voice pitch. Whereas current speech-input interfaces use only verbal information, we aim to build a more user-friendly speech interface by also using nonverbal information, namely voice pitch. By intentionally controlling the pitch, a user can give the same word different meanings (functions) without explicitly switching the speech-input mode. For example, our speech-shift function implemented on a voice-enabled word processor distinguishes an utterance with a high pitch from one with a normal (low) pitch, treating the former as voice-command-mode input (such as file-menu and edit-menu commands) and the latter as regular dictation-mode text input. Experimental results from twenty subjects showed that the speech-shift function is effective, easy to use, and labor-saving.
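The mode decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how pitch is measured or where the boundary between "high" and "normal" lies. The sketch assumes a per-frame fundamental-frequency (F0) track is already available (0 marks unvoiced frames), that a per-user baseline pitch is known, and that a fixed offset in semitones separates the two modes. The names `classify_input_mode`, `baseline_hz`, and `threshold_semitones`, and the default value 5.0, are all hypothetical.

```python
import math
from statistics import mean


def hz_to_semitones(f0_hz, ref_hz):
    """Express a frequency as a semitone offset from a reference frequency."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def classify_input_mode(f0_track_hz, baseline_hz, threshold_semitones=5.0):
    """Return "command" for a high-pitched utterance, otherwise "dictation".

    f0_track_hz: per-frame F0 estimates in Hz; values <= 0 mean unvoiced.
    baseline_hz: the speaker's normal pitch (assumed known in advance).
    threshold_semitones: hypothetical decision boundary, not from the paper.
    """
    voiced = [f for f in f0_track_hz if f > 0]
    if not voiced:
        # No pitch information: fall back to regular dictation input.
        return "dictation"
    mean_offset = mean(hz_to_semitones(f, baseline_hz) for f in voiced)
    return "command" if mean_offset > threshold_semitones else "dictation"


if __name__ == "__main__":
    # Utterance near the speaker's baseline pitch.
    print(classify_input_mode([0, 118, 122, 120, 0], baseline_hz=120.0))
    # Same speaker, deliberately raised pitch.
    print(classify_input_mode([0, 180, 185, 178, 0], baseline_hz=120.0))
```

Measuring the offset in semitones relative to a per-user baseline, rather than in absolute Hz, is a design choice made here so that one threshold can apply to speakers with different natural pitch ranges. A real system would need to estimate the baseline and tune the threshold from data.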


Bibliographic reference. Goto, Masataka / Omoto, Yukihiro / Itou, Katunobu / Kobayashi, Tetsunori (2003): "Speech shift: direct speech-input-mode switching through intentional control of voice pitch", in EUROSPEECH-2003, 1201-1204.